How about adding a new role for context in RAG applications?


In Retrieval-Augmented Generation (RAG) applications, it’s not always clear where or how to include context information within the current OpenAI API structure. I think it would be helpful to introduce a new role — something like “context” — or a dedicated API parameter that explicitly separates retrieved context from user messages or assistant replies.

This would make implementations more intuitive and semantically cleaner, especially when combining multiple sources or managing multi-turn conversations where context must be handled distinctly from prompts.

Would love to hear others’ thoughts or see if this is already being considered!


It's such an obvious idea that I suggested it myself back in Dec 2023 (and probably more than once).

It appears the great minds over at OpenAI are not thinking alike.


How are you all inserting RAG context? What role are you using?

If you’re only providing data with function calls, then the answer is clear.

If your conversations are all non-interactive and single-turn then OpenAI seems to prefer placing them in the system instructions.

If you must insert context without input from the LLM, my instinct would be to define a function that retrieves RAG context, pretend the model called it by adding that tool call to your context, respond to the call with the retrieved context, and then query the model with a tool_choice that prohibits it from calling the function again. This way you can use the function-calling roles to supply data without the overhead of the LLM actually calling a function on every turn.
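The steps above can be sketched as a helper that builds the Responses API input list. This is a sketch, not an official pattern: the function name `lookup_context` and the `call_id` value are made up for illustration (per later posts in this thread, any unique `call_id` string appears to work).

```python
# Sketch of the "pretend the model called it" pattern for the Responses API.
# lookup_context and the call_id are hypothetical names for illustration.
def build_input(user_text, context):
    call_id = "manualToolCall_demo1"  # any unique string, per this thread
    return [
        # The actual user turn:
        {"role": "user",
         "content": [{"type": "input_text", "text": user_text}]},
        # Fabricated tool call, as if the model had requested the context:
        {"type": "function_call", "call_id": call_id,
         "name": "lookup_context", "arguments": "{}"},
        # Our reply to that call, carrying the retrieved RAG context:
        {"type": "function_call_output", "call_id": call_id,
         "output": context},
    ]

# You would then pass this list as `input` to client.responses.create(...),
# with tool_choice="none" so the model can't call lookup_context again.
```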

Nice

How do you pretend to call the function and add a tool call?

I can call a function, then add the function_call_output and use that as context.

But what do you mean “pretend like the model called it and add this tool call to your context”

I tried (in the Responses API) to add a tool_call item, but it isn't a valid type. And I'd prefer to avoid the latency of a real function call, because I'd have to wait for the LLM to call it and provide me with a call_id.
I suppose I could cache an old function call and reuse its call_id, but I'm not sure that would work. (I might try that.)

Basically, I have context I want to include with each round of the conversation.
I was putting it in the system role, and have since moved it to the developer role. (I have also considered putting it in an assistant role.)
I had tried to put it in a tool-call role (but that didn't exist), since it is similar to a file_search call. That seems to be what you're describing; I just didn't know how to go about it.

I created an example on the playground.

When you create something in the playground, you can grab the code for it with the “code” button on the upper-right.

curl

```bash
curl https://api.openai.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
  "model": "gpt-4.1",
  "input": [
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Hello! What is your refund policy?"
        }
      ]
    },
    {
      "type": "function_call",
      "call_id": "manualToolCall_6p9u46",
      "name": "lookup_context",
      "arguments": "{}"
    },
    {
      "type": "function_call_output",
      "call_id": "manualToolCall_6p9u46",
      "output": "# Refunds\n\nRefunds are issued at our sole discretion if there are any quality issues with the product. Please reach out via the \"Contact Us\" page if you would like to request a refund."
    }
  ],
  "text": {
    "format": {
      "type": "text"
    }
  },
  "reasoning": {},
  "tools": [
    {
      "type": "function",
      "name": "lookup_context",
      "description": "Search the vector store for context related to the latest message.",
      "parameters": {
        "type": "object",
        "properties": {},
        "additionalProperties": false,
        "required": []
      },
      "strict": true
    }
  ],
  "temperature": 1,
  "max_output_tokens": 2048,
  "top_p": 1,
  "store": false
}'
```
Python

```python
from openai import OpenAI
client = OpenAI()

response = client.responses.create(
  model="gpt-4.1",
  input=[
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Hello! What is your refund policy?"
        }
      ]
    },
    {
      "type": "function_call",
      "call_id": "manualToolCall_6p9u46",
      "name": "lookup_context",
      "arguments": "{}"
    },
    {
      "type": "function_call_output",
      "call_id": "manualToolCall_6p9u46",
      "output": "# Refunds\n\nRefunds are issued at our sole discretion if there are any quality issues with the product. Please reach out via the \"Contact Us\" page if you would like to request a refund."
    }
  ],
  text={
    "format": {
      "type": "text"
    }
  },
  reasoning={},
  tools=[
    {
      "type": "function",
      "name": "lookup_context",
      "description": "Search the vector store for context related to the latest message.",
      "parameters": {
        "type": "object",
        "properties": {},
        "additionalProperties": False,
        "required": []
      },
      "strict": True
    }
  ],
  temperature=1,
  max_output_tokens=2048,
  top_p=1,
  store=False
)
```
node.js

```javascript
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await openai.responses.create({
  model: "gpt-4.1",
  input: [
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Hello! What is your refund policy?"
        }
      ]
    },
    {
      "type": "function_call",
      "call_id": "manualToolCall_6p9u46",
      "name": "lookup_context",
      "arguments": "{}"
    },
    {
      "type": "function_call_output",
      "call_id": "manualToolCall_6p9u46",
      "output": "# Refunds\n\nRefunds are issued at our sole discretion if there are any quality issues with the product. Please reach out via the \"Contact Us\" page if you would like to request a refund."
    }
  ],
  text: {
    "format": {
      "type": "text"
    }
  },
  reasoning: {},
  tools: [
    {
      "type": "function",
      "name": "lookup_context",
      "description": "Search the vector store for context related to the latest message.",
      "parameters": {
        "type": "object",
        "properties": {},
        "additionalProperties": false,
        "required": []
      },
      "strict": true
    }
  ],
  temperature: 1,
  max_output_tokens: 2048,
  top_p: 1,
  store: false
});
```

I’m not sure where call_id is supposed to come from. I presume you can just make something up.

EDIT: Remember to set tool_choice to "none"! I forgot to do this in the example above.
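Applied to the Python example above, that fix is one extra parameter on the create call. Sketched here as a plain dict of request kwargs; everything other than tool_choice is unchanged from the example:

```python
# With tool_choice set to "none", the model must answer from the supplied
# function_call_output instead of calling lookup_context again this turn.
request_kwargs = {
    "model": "gpt-4.1",
    "tool_choice": "none",  # the fix the EDIT above refers to
    # "input": [...],  # user message + fabricated tool call, as in the example
    # "tools": [...],  # the lookup_context function definition
}
# response = client.responses.create(**request_kwargs)
```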