Appropriate role for Context message in query in RAG

egils · July 15, 2024, 4:15pm

Recently, I came across a doubt regarding which role should be used for the “Context” message in a query sent to ChatCompletion API.
Since the very beginning, I was following OpenAI Cookbooks and till today used the “user” role for this message, e.g.
query = [
{“role”: Role.SYSTEM, “content”: prompt},
{“role”: Role.USER, “content”: context},
{“role”: Role.USER, “content”: question},
]

A few days ago just out of curiosity, I asked ChatGPT this question and it came up with the role “assistant”. The reasoning it provided was very sound (you can test yourselves) and I switched my test environment to use “assistant” instead. And so far I could not spot any decrease in answers’ quality.

Questions to OpenAI folks:

which role would you suggest and why (besides what’s in cookbooks)?
please kindly explain why is there a discrepancy between cookbooks and ChatGPT suggestions (also just curiosity :)?

anon22939549 · July 16, 2024, 3:39am

Please read the OpenAI Model Spec

Specifically the section on Roles.

Subject to its rules, the Model Spec explicitly delegates all remaining power to the developer (for API use cases) and end user. In some cases, the user and developer will provide conflicting instructions; in such cases, the developer message should take precedence. Here is the default ordering of priorities, based on the role of the message:

Platform > Developer > User > Tool

Note: Though not included here the assistant role is assumed to be about on par with Tool.

So, depending on how much you want the model to adhere to the retrieval as fact should dictate what role you assign it to.

egils · July 16, 2024, 7:18am

I want model to NOT perceive content of context data as something provided by user, because it is not user’s input but the attempt of the most appropriate documentation I could match using embeddings to help model answer the question. In my case, data in context message contains both - KB chunks (documentation/tutorials) and some examples of similar (user-agent) conversations about the same/similar topics.

I wish context content is treated as a fact hence this should not be user message, right? Then on other hand, system messages are not meant for large context amounts!? Therefore, the assistant role seems the most logical choice. What I’m missing?

anon22939549 · July 16, 2024, 8:53am

Why? There’s no reason why a system message cannot be arbitrarily long.

egils · July 19, 2024, 11:00am

Thank you! Indeed, there is no separate limitation for length of system messages. Changed roles to system for all non-user content and my fidelity tests are passing now even for questions having mixed results previously.
Are there known drawbacks of having multiple system messages within a single query?

Topic		Replies	Views
Contexts with the new turbo end point API	22	6386	September 23, 2023
Different roles in the API and their use cases API	1	9040	April 16, 2024
With what role should external information be flagged as? Plugins / Actions builders plugin-development	6	1676	August 15, 2023
Asking a model to do something without asking as the user API gpt-4 , api	5	1641	February 23, 2024
Understanding Role Management in OpenAI's API: Two Methods Compared API chatgpt , api	4	75354	February 6, 2024

Appropriate role for Context message in query in RAG

Related topics