When switching from GPT-4o to GPT-4o-mini, retrieval almost stops working. I have a vector store attached to my assistant, and GPT-4o-mini just seems to ignore the instructions to always retrieve context, whereas GPT-4o doesn’t.
I submitted a similar problem yesterday, so let me jump on yours to try to bump it. My sense is that gpt-4o-mini is treating the vector search tool as a function tool rather than running it internally. Attached is a picture of the playground where the problem is demonstrated.
I was just trying different prompts in the playground. I cannot make GPT-4o-mini always retrieve using the assistant instructions, but I had some success by formatting the user message and adding ‘answer by retrieving context’ at the start or end of it.
Doing so makes retrieval work most of the time, but not always. If the model refuses to retrieve once, then any subsequent message will also fail to retrieve. Starting a new thread makes it retrieve again.
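For reference, here’s a minimal sketch of the workaround. The helper name and the exact instruction wording are just mine, not anything official; it simply wraps the question before I send it as the user message:

```python
def with_retrieval_hint(question: str) -> str:
    """Prepend an explicit retrieval instruction to the user message.

    This nudges gpt-4o-mini to actually run retrieval; the wording
    'answer by retrieving context' is just what happened to work for me.
    """
    return f"answer by retrieving context: {question}"

# Usage: send the wrapped string as the thread's user message.
msg = with_retrieval_hint("What is our refund policy?")
```

If it still refuses, I create a fresh thread and resend the wrapped message, since the refusal seems to stick within a thread.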
I’m having the same issue migrating from gpt-3.5-turbo-1106 to gpt-4o-mini.
I attach my vector store to the thread for each new conversation. With gpt-3.5-turbo it was working fine, running retrieval to fetch the answer from the knowledge base, but gpt-4o-mini never uses it.
I tried updating the assistant instructions to ask it to use the retrieval tool, but that didn’t work. The only thing that worked was the @anon-dev-72 suggestion above, but it doesn’t seem very reliable.
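For context, this is roughly how I attach the vector store per conversation. I’m just building the request payload here (the vector store ID is a placeholder, and the dict shape follows the Assistants API `tool_resources` format as I understand it), which then gets passed to the thread-creation call:

```python
VECTOR_STORE_ID = "vs_abc123"  # placeholder; substitute your own store ID

def thread_payload(vector_store_id: str) -> dict:
    # Shape I pass as kwargs when creating a new thread,
    # so file_search can see the knowledge base for that conversation.
    return {
        "tool_resources": {
            "file_search": {"vector_store_ids": [vector_store_id]}
        }
    }

payload = thread_payload(VECTOR_STORE_ID)
```

So the store is definitely attached per thread; gpt-4o-mini just doesn’t call the tool.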
These models are just pattern recognizers, so once they see themselves following a pattern (like not calling a tool), they tend to follow that same pattern on subsequent turns.
You’re on the right path with adding instructions to your user message. For smaller models like gpt-3.5-turbo and gpt-4o-mini, you want your instructions as close to the end of the prompt as possible, so making them part of the user message will yield the best results.
One tip, though: make your instruction an actual instruction, not a request like you have it now. The smaller models respond better to an active voice than a passive one. So a better way to structure your user message is: “call the xyz tool to retrieve context and then answer this question: {question}”
The more specific the instruction, the better, so ideally you’re telling it the exact name of the tool to run. How do you find that name? I’d start by simply asking your assistant, “what’s the name of the tool you use to retrieve context?” That worked for me when assistants first launched (I don’t use assistants anymore), so it will likely just tell you the tool’s name.
The other thing you might try is pairing the instruction in your user message with a similar instruction in your system message, like “always call the xyz tool to retrieve context before answering a question”. Again, these models are pattern matchers, and instructions are just patterns to them. The more times they see an instruction, and more importantly, the more they see themselves following an instruction, the more likely they are to keep following it.
Explicitly tell it to use file_search when searching for files. Also, it tends to use msearch when it fails to retrieve files. If so, return ‘wrong tool used, use file_search instead’. This fixed it for me after a long search.