The search documentation states this:
“File-based search is a two-step procedure that begins by narrowing the documents in the provided file to at most max_rerank number of documents using a conventional keyword search.”
1. Can anyone elaborate on what is meant by “keyword search”? It must be something more sophisticated than “exact match” but less sophisticated than the semantic re-ranking occurring in the second step.
2. If nothing is found during the keyword step, no re-ranking is performed, correct?
3. Can the problem in #2 be avoided by specifying max_rerank larger than the total number of documents?
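To make these questions concrete, here is a rough sketch of how I currently picture the two steps. This is only my mental model and not OpenAI's actual implementation: the function names (`tokenize`, `keyword_filter`, `two_step_search`) are hypothetical, the keyword step is a naive token-overlap filter that I am assuming for illustration, and the semantic scorer is a placeholder passed in by the caller.

```python
import re
from typing import Callable, List, Set


def tokenize(text: str) -> Set[str]:
    # Lowercase the text and split it into alphanumeric tokens.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_filter(query: str, documents: List[str], max_rerank: int) -> List[str]:
    # Step 1 (assumed behavior): keep only documents that share at least one
    # token with the query, then cap the result at max_rerank candidates.
    query_tokens = tokenize(query)
    scored = []
    for doc in documents:
        overlap = len(query_tokens & tokenize(doc))
        if overlap > 0:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:max_rerank]]


def two_step_search(
    query: str,
    documents: List[str],
    max_rerank: int,
    semantic_score: Callable[[str, str], float],
) -> List[str]:
    # Step 2 (assumed behavior): re-rank only the candidates that survived
    # step 1. If step 1 returns nothing, there is nothing to re-rank.
    candidates = keyword_filter(query, documents, max_rerank)
    return sorted(candidates, key=lambda doc: semantic_score(query, doc), reverse=True)


docs = [
    "myocardial infarction treatment protocol",
    "seasonal pollen allergy guidance",
]

# Placeholder scorer standing in for the semantic model.
def dummy_score(query: str, doc: str) -> float:
    return 0.0

# No token overlap, so the result is empty even though max_rerank
# is far larger than the number of documents.
print(two_step_search("heart attack", docs, max_rerank=100, semantic_score=dummy_score))
```

If the real keyword step behaves anything like this, max_rerank is only an upper bound on the candidates and raising it cannot bring back documents that the keyword step never matched. Whether that is how the endpoint actually works is exactly what I am asking.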
For context, my use case involves documents with highly complex, technical language, so there could be a relatively high proportion of instances where nothing is found from a user’s query based on a keyword search. That’s exactly why I need the semantic capabilities of GPT-3 – not to be hamstrung by keyword search. See the irony here?
I understand that there are compute costs that make some constraints necessary. Would love to discuss the future of search in more depth with the OpenAI team.