Clarification on "max_rerank"

I need some clarification on what “max_rerank” means.

If I have 1,000 documents and I perform a search with max_rerank set to 5, does it mean that

a) all 1,000 docs get searched (i.e. compared to my query), but only the top 5 matches get returned, or

b) only 5 docs get searched, ranked and returned, ignoring the 995 other documents?

I think I am partly confused about this because I don’t understand your cost structure. I would guess that your engine, in order to spit out even just the one best matching result, needs to “index” or “compare against” all documents. If that is so, shouldn’t it be fairly irrelevant for your endpoint whether it spits out just the #1 result, or 200? The scores for all documents must have been computed anyway.

Thanks for clarifying!


Hi, thanks for the question. The search endpoint can work as a two-step process, which is especially useful if you want to search over very large numbers of documents. If you use max_rerank=5 and have 1,000 documents, then conventional keyword search is used to select the 5 most relevant documents, and GPT-3 is then used to score each of these 5 documents.

In practice we find this gives pretty good results while being a lot faster and cheaper than using GPT-3 search for all documents.

More details available here: OpenAI API
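To make the two-step idea concrete, here is a minimal sketch of "cheap keyword filter first, expensive scoring second". It is not the endpoint's actual implementation: `keyword_score`, `two_stage_search` and the `expensive_score` callback are hypothetical names, and the word-overlap count is only a stand-in for whatever keyword search the API really uses.

```python
from typing import Callable, List, Tuple


def keyword_score(query: str, document: str) -> int:
    """Count distinct query words that occur in the document.

    Assumption: a crude stand-in for the real keyword search.
    """
    doc_words = set(document.lower().split())
    return sum(1 for word in set(query.lower().split()) if word in doc_words)


def two_stage_search(
    query: str,
    documents: List[str],
    max_rerank: int,
    expensive_score: Callable[[str, str], float],
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the re-ranked candidates."""
    # Stage 1: rank ALL documents with the cheap keyword score
    # and keep only the top `max_rerank` indices.
    candidates = sorted(
        range(len(documents)),
        key=lambda i: keyword_score(query, documents[i]),
        reverse=True,
    )[:max_rerank]

    # Stage 2: call the expensive scorer (the model, in the real API)
    # only for the surviving candidates.
    scored = [(i, expensive_score(query, documents[i])) for i in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = ["apple pie recipe", "banana bread", "apple tart", "car engine"]
    # Hypothetical scorer; a real one would call the model.
    fake_model = lambda q, d: float(keyword_score(q, d))
    print(two_stage_search("apple recipe", docs, max_rerank=2, expensive_score=fake_model))
```

In this sketch every document is looked at by the keyword stage, but the expensive scorer runs only `max_rerank` times, which is where the speed and cost savings come from.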


Good point! I’ll take a note of this use case. For now, you can send 200 documents with max_rerank=200 five times, which gives you accurate scores for each of the 1,000 documents. You can use one of the cheaper (and faster) models, such as ada, to keep the costs down.
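The batching workaround could look roughly like the sketch below. `score_all_in_batches` is a hypothetical helper, and `search_batch` is a placeholder for whatever function sends one search request (with max_rerank equal to the batch size) and returns one score per document; neither is part of the API itself.

```python
from typing import Callable, List, Sequence


def score_all_in_batches(
    query: str,
    documents: Sequence[str],
    search_batch: Callable[[str, List[str]], List[float]],
    batch_size: int = 200,
) -> List[float]:
    """Score every document by sending them in batches of `batch_size`.

    Assumption: `search_batch(query, batch)` performs one search request with
    max_rerank=len(batch) and returns scores in the same order as `batch`.
    """
    scores: List[float] = []
    for start in range(0, len(documents), batch_size):
        batch = list(documents[start:start + batch_size])
        batch_scores = search_batch(query, batch)
        if len(batch_scores) != len(batch):
            raise ValueError("expected one score per document in the batch")
        scores.extend(batch_scores)
    return scores
```

With 1,000 documents and the default batch size, this makes five requests of 200 documents each and returns 1,000 scores in the original document order.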


Oh, wow. Thank you, Boris. I had no idea the API would use basic keyword search. I don’t think the docs say that.

Now I have a much better understanding of how search works behind the scenes. As I understand it, if I want the search done by GPT-3, I’ll have to split my request into 200-document batches and set max_rerank=200. And it’s going to cost a lot of tokens.
