We’re running into an issue where our assistant sporadically fails to use File Search when responding to a prompt. We haven’t quantified how often this happens, but several times a day users report incorrect responses from our assistant. When we look at the run steps, there is no indication that the retrieval step happened.
It also seems to happen persistently for a short period (e.g. several prompts in a row) and then resolve, after which it works consistently for a while. That has led us to wonder whether an internal issue/outage is happening during these windows, although the absence of a failed run step might rule this out.
Is there any additional information about how the assistant decides when to use tools such as File Search? Any suggestions for troubleshooting this?
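For context, here is roughly how we inspect the run steps (a minimal sketch assuming the v2 Assistants API via the Python SDK; the IDs are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# Placeholder IDs for the run being investigated.
thread_id = "thread_abc123"
run_id = "run_abc123"

steps = client.beta.threads.runs.steps.list(thread_id=thread_id, run_id=run_id)

# Look for a tool_calls step that actually invoked file_search.
used_file_search = any(
    call.type == "file_search"
    for step in steps.data
    if step.step_details.type == "tool_calls"
    for call in step.step_details.tool_calls
)
print("file_search invoked:", used_file_search)
```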
This seems to happen to folks sporadically. It helps to specify that your Assistant should run a retrieval from this-or-that vector store or file before it answers, either in its Instructions or in the User Prompt.
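For example, something along these lines when configuring the Assistant (a sketch assuming the Python SDK; the IDs and wording are illustrative only):

```python
from openai import OpenAI

client = OpenAI()

client.beta.assistants.update(
    "asst_abc123",  # placeholder assistant ID
    instructions=(
        "Before answering, ALWAYS run a File Search against the attached "
        "vector store and base your answer on the passages you retrieve."
    ),
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": ["vs_abc123"]}},  # placeholder
)
```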
The fundamental issue is that the myfiles_browser tool, which you are told is “file search”, has a description in which the AI is told that “the user has uploaded files”.
If the user doesn’t talk about “their uploaded files”, the tool is unlikely to be employed.
Nor can the coverage and utility of the information behind the tool be added to the description that OpenAI has placed on it.
For your own function, you might write something like “performs a semantic search based on a query you write, returning the documentation from the pinball machine troubleshooting database most similar to what the AI wrote.” Not for this tool, though.
Therefore, you must write instructions that talk directly about the tool and its msearch method, countering the tool’s counter-productive guidance: clarify that the files are part of the application and of the task the AI performs, that they are not the user’s “files” but a knowledge database; describe what the AI will find when it searches; and spell out the scenarios in which it is mandatory to answer from that knowledge. Rough wording is sketched below.
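As a rough illustration of the kind of wording I mean (the pinball domain and scenarios are only an example; adapt them to your own knowledge base):

```
The myfiles_browser tool is NOT a browser for files the user uploaded. It is
the application's knowledge database of pinball machine troubleshooting
documentation. Use its msearch method to run a semantic search and retrieve
the most relevant passages. For any question about pinball machine faults,
error codes, or repairs, you MUST search this knowledge database and answer
from what you retrieve before replying to the user.
```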
Then, better application performance will come from recommending in your interface that a thread be abandoned before it grows to the maximum, because dropping 10k+ tokens into the thread with every invocation distracts the model from following instructions long before that limit is reached.
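A minimal sketch of one way to handle that in the application layer (assuming the Python SDK; the message threshold is an arbitrary number you would tune):

```python
from openai import OpenAI

client = OpenAI()

MAX_MESSAGES = 40  # arbitrary cutoff; tune for your own instruction-following needs

def thread_for_next_turn(thread_id: str) -> str:
    """Keep using the thread until it grows long, then start a fresh one."""
    messages = client.beta.threads.messages.list(thread_id=thread_id, limit=100)
    if len(messages.data) >= MAX_MESSAGES:
        return client.beta.threads.create().id
    return thread_id
```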
You can also add a directive like “a preliminary answer must ALWAYS be sent to the myfiles_browser recipient before producing any answer directly to a user” and see whether that reveals the vector store having problems (as it did earlier today) or whether the issue is the AI’s understanding.
Quick update: the original issue I reported above was with an assistant using 4o-mini. We experimented with switching to 4o (non-mini), and the problem seems to have gone away in our testing so far.
So perhaps 4o-mini is significantly worse at tool/function calling?