I am using the assistants api with file retrieval and code interpreter tools. The assistant will frequently mention the files by name. I want it to act like it can just see the data and not explicitly reference the files, or talk about re-importing the files etc. I know this has been mentioned on the forum before but there doesn’t seem to have been any working solution, and wondering if any one has found a solution since.
To be clear, I don’t mind the annotations to files that are sometimes added to the response, those are easy to deal with. What I am talking abut is when the assistant explicitly references the files in the text as in: “It seems that the session reset has caused a loss of import statements. Let me correct that and re-import necessary modules to process the files.”
I have tried numerous prompts along the lines of "don’t mention the files, the user should be unaware of the existence of the files. " to absolutely no avail.
I know that I could run the response through another LLM to rewrite the response in a way that doesn’t mention the files, however I am using the streaming API as well, so this isn’t really an option.
It’s pretty much impossible to stop the AI from producing irrelevant recitations about files, or to stop free reign over them from an unprivileged user, even those files that are meant to be internal, when OpenAI has damaged Assistants and Responses in this cleverly droll manner:
The most straightforward way is to make direct use of vector stores - and not even have a search tool, just use input context RAG injection based on the context of what is being asked and some rewriting.
A new guide just for you:
Of course better is to use embeddings yourself - but also a higher hurdle than a service provided by the same provider as you language model.