Assistant file search text retrieval

aaron.lutz · July 15, 2024, 2:33pm

I'm looking for a similar solution but have not found anything promising yet.

I managed to get some more info on how file search actually works, but unfortunately this is not documented…

Here’s a quick rundown.

The AI model you specified (GPT-3.5, GPT-4, GPT-4o, etc.) outputs a search query to the search tool. It looks like this:

msearch(["Search Query generated by the Assistant"])

File Search then performs a semantic and keyword search to find the most relevant results. It seemed to me that before the results are passed to the assistant, they get re-ranked or filtered, and only the most relevant ones are passed on.
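To make the idea of "semantic plus keyword search, then keep only the top results" concrete, here is a toy sketch. It is not how OpenAI implements File Search, which is undocumented as far as I can tell. All names (`hybrid_search`, `keyword_weight`, `top_k`) are made up for illustration, and the "semantic" score is just a cosine similarity over word counts standing in for real embedding similarity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_score(query, chunk):
    """Fraction of distinct query terms that also appear in the chunk."""
    q_terms = set(tokenize(query))
    c_terms = set(tokenize(chunk))
    return len(q_terms & c_terms) / len(q_terms) if q_terms else 0.0


def cosine_score(query, chunk):
    """Cosine similarity of word-count vectors (stand-in for embeddings)."""
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    dot = sum(q[t] * c[t] for t in q)
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    c_norm = math.sqrt(sum(v * v for v in c.values()))
    return dot / (q_norm * c_norm) if q_norm and c_norm else 0.0


def hybrid_search(query, chunks, top_k=2, keyword_weight=0.5):
    """Blend both scores, sort best-first, and keep at most top_k hits."""
    scored = []
    for chunk in chunks:
        score = (keyword_weight * keyword_score(query, chunk)
                 + (1 - keyword_weight) * cosine_score(query, chunk))
        scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks with no overlap at all, mimicking a relevance filter.
    return [chunk for score, chunk in scored[:top_k] if score > 0]


chunks = [
    "Refunds are processed within 14 days.",
    "Our office is closed on public holidays.",
    "To request a refund, contact support.",
]
print(hybrid_search("refund request", chunks))
```

The point is only the shape of the pipeline: score every chunk two ways, combine, sort, and pass a small filtered subset on to the model.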

The result(s) look like this:

[
  {
    "message_idx": 12,
    "search_idx": 0,
    "text": "Text from the file, i.e. the search result. This text is exactly as it is in your source document.",
    "source": "sourcefile.txt"
  }
]
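If you do get the model to print results in this shape, something like the following could turn them into objects you can work with. This is a minimal sketch and assumes the output is valid JSON with plain straight quotes and exactly the four fields shown above; `SearchResult`, `parse_search_results` and `texts_by_source` are my own hypothetical names, not part of any OpenAI SDK.

```python
import json
from dataclasses import dataclass


@dataclass
class SearchResult:
    message_idx: int
    search_idx: int
    text: str
    source: str


def parse_search_results(raw):
    """Parse the JSON array printed by the model into SearchResult objects."""
    items = json.loads(raw)
    return [
        SearchResult(
            message_idx=item["message_idx"],
            search_idx=item["search_idx"],
            text=item["text"],
            source=item["source"],
        )
        for item in items
    ]


def texts_by_source(results):
    """Group the retrieved text snippets by the file they came from."""
    grouped = {}
    for result in results:
        grouped.setdefault(result.source, []).append(result.text)
    return grouped


# Sample input copied from the format above (text shortened).
raw_output = """
[
  {
    "message_idx": 12,
    "search_idx": 0,
    "text": "Text from the file, i.e. the search result.",
    "source": "sourcefile.txt"
  }
]
"""

results = parse_search_results(raw_output)
print(texts_by_source(results))
```

Since the model is only echoing the results as text, the output may be truncated or slightly malformed, so in practice you would want to wrap `json.loads` in a `try`/`except json.JSONDecodeError`.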

Unfortunately, this is not visible in the run step logs or anywhere similar; at least, I could not find it. But I think the results above may be what you are looking for. I had to use several multi-step prompts to finally get the model to output the search results like this. It would be really helpful if OpenAI offered more documentation on this.

I also posted a thread touching on this topic:
