Say you want the model to read a local text file and then do something with it. So you must add the content of the file to the conversation history.
But do you add it with the ‘System’, the ‘Assistant’, or the ‘User’ role? I think ‘User’ would probably be a bad choice, since the model would then assume that the user entered it manually, or at least knows the content of the text file.
‘System’ seems like the most appropriate choice to me, since it’s not the model itself that’s retrieving the information either, but I don’t know whether that really matches the intended purpose of the ‘system’ role.
What do you think about it?
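Roughly what I mean, as a sketch (openai Python library; the file name and model are just placeholders, and the role on the injected message is exactly what I’m unsure about):

```python
import openai  # openai Python package; reads the API key from the OPENAI_API_KEY env var

file_text = open("notes.txt").read()  # placeholder local file

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    # This is the open question: should this message be "system", "user", or "assistant"?
    {"role": "system", "content": "Contents of notes.txt:\n" + file_text},
    {"role": "user", "content": "Summarize the file for me."},
]

response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(response["choices"][0]["message"]["content"])
```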
Edit: Thanks for your thoughts. Just using the system role does indeed work so far. My concern is that using it so liberally may end up confusing the AI, especially with regard to sticking to the original system message at the start of the chat. So I think it may indeed pay to experiment a bit, starting e.g. with @_j ’s approach.
I’m guessing you’re using GPT as a user-facing conversational agent?
If so, agreed. The system message is ideal (imo). When it fits, it fits! Even better to have it known as potential context to be used in its response.
It almost feels like muddying the conversation when using the user role.
But some people say otherwise, and even the OpenAI Cookbook puts such content inside the user message.
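For contrast, a sketch of the pattern I mean (wording and names are my own, loosely following the Cookbook’s question-answering examples):

```python
context = open("notes.txt").read()  # placeholder for the loaded/retrieved text
question = "What does the file say about deadlines?"

messages = [
    {"role": "system", "content": "Answer using only the provided context."},
    # retrieved text folded directly into the user message
    {"role": "user", "content": "Context:\n" + context + "\n\nQuestion: " + question},
]
```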
The plugin model uses the tool role for plugin-generated information.
I’ve not tried setting the role to tool in the API; the OpenAI documentation says only system, user, assistant, and function are supported. But I don’t think it would have any substantive effect even if it works.
I would probably inject it into the system message.
#tools, the category under which functions are inserted by text injection, goes at the end of the system prompt: two carriage returns, the tools heading with the hashmark, two more carriage returns, and then `##functions`. However, that prepares the AI to use those definitions; it isn’t necessarily where information should go.
The AI can easily understand a category of your own, such as ##documentation, even one that has just been given in a user role message. A typical use I’d make of this for coding is to say “here’s some documentation for the code I’ll have you write, just accept this into your conversation history and wait for more instructions”. However, if you are the programmer and priming the AI for answering, it seems a bit odd for the user to supply information that is then used directly to answer that same user’s questions.
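A rough reconstruction of that layout, plus a home-made category appended the same way (this is my own approximation from observation, not official documentation; the ##documentation name and the file are made up):

```python
# Approximate shape of the system message after function injection, as described above.
function_definitions = "<function definitions as given to the API>"  # placeholder

system_prompt = (
    "You are a helpful assistant."  # the system prompt you wrote
    + "\n\n#tools\n\n"              # category appended at the end of the system prompt
    + "##functions\n\n"
    + function_definitions
)

# A fabricated category of your own can follow the same pattern:
documentation = open("my_docs.txt").read()  # placeholder file
system_prompt += "\n\n##documentation\n\n" + documentation
```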
I think the most direct way is another virtual “assistant” role message, inserted right before the user input that brought about the database lookup, and you can give it a bit of an introduction: “database lookup retrieval knowledge for fulfilling the following user question”.
Inserted with such a description right before the user question that produced the embeddings data, the AI can understand that it is not part of the active conversation, and the inserted text also doesn’t sit far back in a seemingly unrelated system message at the start of the chat history.
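A minimal sketch of that arrangement (the lookup function and all the wording are stand-ins, not a real retrieval implementation):

```python
def lookup_database(question: str) -> str:
    """Stand-in for your embeddings / database search."""
    return "...relevant passages retrieved for the question..."

chat_history = [
    {"role": "system", "content": "You are a helpful assistant."},
    # ...earlier turns of the conversation...
]

user_question = "What did the report say about Q3 shipping delays?"

messages = chat_history + [
    # virtual assistant message, inserted right before the user question that caused the lookup
    {
        "role": "assistant",
        "content": "Database lookup retrieval knowledge for fulfilling the "
                   "following user question:\n\n" + lookup_database(user_question),
    },
    {"role": "user", "content": user_question},
]
```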
Hey, thanks for your time and your thoughts. I like that suggestion a lot. In a way it may even be appropriate to label it with the assistant role, since you can imagine the AI “reading and memorizing” the text, as if it had become part of its internal monologue.
I wasn’t aware of these “categories” or that you could introduce categories by leading the text with “##[category]”. Is that some protocol defined by OpenAI or just something which the AI picks up on naturally?
It’s not really a “protocol”, but rather the way the function-calling model (and you need to pass a function to get the function-trained AI) has been shown, through training, to receive and understand predefined functions.
The functions that are given to the AI via the API are inserted at the end of the first system role message, and the AI has been shown many examples of how that specifically means it should produce particular outputs based on the specification. It also doesn’t need any function (other than a dummy function to ensure you get that model) to call a python function that was never actually specified (the training behind “code interpreter”).
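A hedged sketch of that dummy-function idea (the noop function is a placeholder I made up; it exists only so the request goes to the function-trained model):

```python
import openai

dummy_functions = [
    {
        "name": "noop",  # placeholder; never meant to be called
        "description": "Does nothing.",
        "parameters": {"type": "object", "properties": {}},
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",  # a function-calling-capable model of that era
    messages=[{"role": "user", "content": "Hello"}],
    functions=dummy_functions,
)
```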
I haven’t extracted how plugins are inserted into ChatGPT for AI understanding.
If we look at the “tools” category (under which functions appear) simply as a kind of pretraining the AI has, then we can also make a #database or #documentation category, or other such sections of the system prompt message, and assume that, even though we just fabricated those sections, the AI will also be predisposed to see them as some sort of special guidance for its operation.
Using another assistant message for context delivery is interesting. I haven’t tried it myself; my concern is similar to the one about using the user role. But if GPT is properly trained to “understand” that there can be multiple kinds of assistant messages in a conversation, that’s really neat and I would love to do it myself (makes sense)
What you’re saying about the system message is interesting as well. It always seemed that OpenAI intended the system message strictly for setting the initial environment (implied by the Playground, where it spans the complete conversation and cannot be repeated).
Yet when it was first released, I and others found success appending a system message (along with the initial one) to the end of the conversation for context retrieval. But, well, that was a couple of months ago, when it was all new.
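Roughly what that looked like (the message contents here are just illustrative):

```python
retrieved_context = "...text retrieved for the current question..."  # placeholder

messages = [
    {"role": "system", "content": "You are a helpful assistant."},  # initial system message
    {"role": "user", "content": "An earlier question..."},
    {"role": "assistant", "content": "An earlier answer..."},
    # a second system message, appended near the end, carrying the retrieved context
    {"role": "system", "content": "Use this context when answering:\n" + retrieved_context},
    {"role": "user", "content": "The current question..."},
]
```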