Should RAG retrieved documents be sent as system or user messages?

Interesting link, thanks for sharing!

Another possibility would be to use the optional name property of user or system messages to distinguish between questions/instructions and provided data. But I imagine that current models are mostly trained on data in tool messages. It would be interesting to run some systematic evals on this.
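
To make the idea concrete, here is a minimal sketch, assuming the Chat Completions message format; the `retrieved_context` and `end_user` labels are just illustrative names, not anything the API requires:

```python
# Sketch: using the optional "name" field on user messages to label
# injected retrieval results separately from the actual question.
# This list would be passed as `messages` to chat.completions.create.
messages = [
    {"role": "system", "content": "Answer using the provided context when relevant."},
    {"role": "user", "name": "retrieved_context",   # illustrative label for RAG data
     "content": "Excerpt from the knowledge base: ..."},
    {"role": "user", "name": "end_user",             # illustrative label for the real question
     "content": "What does the policy say about refunds?"},
]
```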

1 Like

Very interesting. Do you mind explaining this part in other words:
“OpenAI wants to take away creative use with the enforcement of ‘tools’”?

Does this, “Then assistants makes RAG even more useless, with the messages being persistent and blowing up in size,” mean that you are criticizing how the Assistants API carries the old RAG data in the conversation history (until it’s trimmed for being full)? So it adds cost.

With the role “function”, you can place those messages anywhere in the list of messages you want, with whatever content is desired.

However, “tool” cannot be used in such a manner. The API strictly enforces that an “assistant” role message must include a tool_calls entry with an ID, and that the “tool” role message immediately following it must carry the same ID; otherwise you get an API error. The injected content would also be mostly ignored by the AI if it were not placed AFTER the user question.

You can expand the code above, where I procedurally build readable system, user, tool_calls, and tool-return messages to send.
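
Along those lines, here is a minimal sketch (not the code referenced above) contrasting the two message shapes, assuming the standard Chat Completions format; `knowledge_search` and the call ID are placeholder names:

```python
# Legacy "function" role: a message like this can sit anywhere in the list,
# with whatever content you want to inject (e.g. RAG results).
legacy_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "function", "name": "knowledge_search",        # placeholder name
     "content": "Retrieved passage: ..."},
    {"role": "user", "content": "What does the document say about X?"},
]

# "tool" role: the API only accepts it immediately after an assistant message
# whose tool_calls entry carries a matching id.
tool_messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What does the document say about X?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_abc123", "type": "function",
         "function": {"name": "knowledge_search",           # placeholder name
                      "arguments": '{"query": "X"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_abc123",          # must match the id above
     "content": "Retrieved passage: ..."},
]
```

Either list is then passed as `messages` to `chat.completions.create`; only the second shape passes the tool-call validation.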

Assistants maintains in a thread all the tool calls it made to its own myfiles_browser search (which is file_search), the same as if it had called tools in your code. myfiles_browser can return 20,000+ tokens into a thread and doesn’t care whether the results from files are completely irrelevant to the query. The only reasonable way to provide transitory knowledge is to place the text in the “additional_instructions” of a run. More reasonable still is not to use Assistants at all if you want anything better than ChatGPT.
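
A sketch of that additional_instructions approach, assuming the Python SDK’s beta Assistants interface; the thread and assistant IDs and the retrieved text are placeholders:

```python
from openai import OpenAI

client = OpenAI()

# Inject transitory retrieved text into a single run via additional_instructions,
# so it is not stored in the thread's message history.
retrieved_text = "Relevant excerpt retrieved for this question: ..."  # placeholder

run = client.beta.threads.runs.create(
    thread_id="thread_abc123",     # placeholder thread ID
    assistant_id="asst_abc123",    # placeholder assistant ID
    additional_instructions=f"Use this context if relevant:\n{retrieved_text}",
)
```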

2 Likes

The constraint you mention, introduced when moving from function calls to tool calls, also applies to the Chat Completions API, not only the Assistants API.
My assumption was that it became necessary because, with tool calls, the AI can request more than one tool invocation in the same reply, so each tool call has to be matched to its corresponding response, as in the sketch below.
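
A small sketch of that parallel case, assuming the standard Chat Completions format; `get_weather` and the call IDs are made up:

```python
# With parallel tool calls, one assistant message can carry several tool_calls,
# and each "tool" reply is matched back to its request by tool_call_id.
messages = [
    {"role": "user", "content": "Compare the weather in Paris and Rome."},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},
        {"id": "call_2", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Rome"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_1", "content": "Paris: 18°C, cloudy"},
    {"role": "tool", "tool_call_id": "call_2", "content": "Rome: 24°C, sunny"},
]
```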

1 Like