Ranking / Scoring documents in Question Answering

I am wondering how to rank the documents when using Question Answering, so I know which document(s) the answers are based on. It is not clear from the API reference for answers how to do this.

Any ideas?


Hi @Andreas :wave:

The response from a request to the answers endpoint includes a selected_documents attribute. See the OpenAI API reference for answers; the sample response is shown on the right-hand side.

You may also want to explore return_metadata.
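To make that concrete, here is a minimal sketch of pulling selected_documents out of a response. The response shape (field names like document and text) is assumed from the legacy answers API reference and may differ in other versions:

```python
# Sketch: inspecting selected_documents from an answers response.
# The response shape here is an assumption based on the legacy
# /v1/answers docs; field names may differ in your API version.

sample_response = {
    "answers": ["puppy A."],
    "selected_documents": [
        {"document": 0, "text": "Puppy A is happy."},
        {"document": 1, "text": "Puppy B is sad."},
    ],
}

def list_source_documents(response):
    """Return (index, text) pairs for the documents the answer drew on."""
    return [(d["document"], d["text"])
            for d in response.get("selected_documents", [])]

print(list_source_documents(sample_response))
```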

Hi @sps :wave:

It does indeed return selected_documents, but when I try it out (with 5 sample documents) they are all returned, in no particular order and without a score, so it is unclear which ones are the relevant ones.

1 Like

That is very intriguing. Given that the answers API first uses a search model and then another engine for completion, it does seem weird that the score isn’t being returned with the selected documents.

UPDATE: I tested it on my end and scores are being returned

It seems that the specific response snippet in the docs is outdated.
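If scores do come back, ranking the sources is straightforward. A minimal sketch, assuming each entry in selected_documents may carry a score field (entries without one, as in the older docs snippet, default to 0.0):

```python
def rank_selected_documents(selected_documents):
    """Sort selected documents by score, highest first.

    Entries without a "score" field (as in the older docs snippet)
    default to 0.0 so the sort still works.
    """
    return sorted(selected_documents,
                  key=lambda d: d.get("score", 0.0),
                  reverse=True)

docs = [
    {"document": 0, "text": "Puppy A is happy.", "score": 12.3},
    {"document": 1, "text": "Puppy B is sad.", "score": 54.1},
]
print(rank_selected_documents(docs)[0]["text"])  # highest-scoring document
```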

Oh, interesting! Did you use uploaded documents? So far I have only used “hard coded” documents, as in the example:

documents=["Puppy A is happy.", "Puppy B is sad."],

and I get no score…
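For completeness, the full request I’m building looks roughly like this. Parameter names follow the legacy /v1/answers reference, and the examples_context/examples values are just illustrative placeholders, so treat this as a sketch rather than a definitive call:

```python
# Sketch of a legacy /v1/answers request body using inline documents.
# Parameter names are from the old API reference; the endpoint is
# deprecated, so this is illustrative only.

def build_answers_payload(question, documents):
    return {
        "model": "curie",          # completion engine
        "search_model": "ada",     # first-stage search engine
        "question": question,
        "documents": documents,    # inline docs instead of an uploaded file
        "examples_context": "Puppy C is playful.",   # placeholder example
        "examples": [["Which puppy is playful?", "puppy C."]],
        "max_tokens": 5,
    }

payload = build_answers_payload(
    "Which puppy is happy?",
    ["Puppy A is happy.", "Puppy B is sad."],
)
print(sorted(payload))
```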

1 Like

Yes @Andreas, I’m using file instead of documents, but it should return scores regardless of whether I use file (to upload a large number of documents) or simply pass an array of lines via documents. This keeps getting more interesting.

Can you also share the search_model and engine you’re using?

UPDATE: Yes, you’re correct. Scores are returned when file is used, as opposed to documents.

I wonder why that would be the case @staff

I believe the answers endpoint first uses a keyword search to narrow the documents down to the top 200 and then re-ranks those, giving a score for each. This assumes you uploaded a file with >200 lines. If you have fewer than 200 documents, perhaps the search/re-rank steps are skipped, hence no score is generated.
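That two-stage pipeline can be sketched as: a cheap keyword filter down to the top 200 candidates, followed by a re-rank that assigns each survivor a score. In this toy illustration both stages use naive word overlap with the question; the real endpoint uses the search_model for semantic scoring, so this only shows the shape of the process:

```python
def keyword_filter(question, documents, top_n=200):
    """Stage 1: narrow to the top_n docs by naive word overlap."""
    q_words = set(question.lower().split())
    def overlap(doc):
        return len(q_words & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:top_n]

def rerank(question, documents):
    """Stage 2: stand-in for the semantic re-rank; returns (doc, score) pairs.

    The "score" here is just keyword overlap again; the real endpoint
    computes semantic similarity scores with the search_model.
    """
    q_words = set(question.lower().split())
    scored = ((d, float(len(q_words & set(d.lower().split()))))
              for d in documents)
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

docs = ["Puppy A is happy.", "Puppy B is sad.", "Kittens sleep all day."]
candidates = keyword_filter("Which puppy is happy?", docs)
for doc, score in rerank("Which puppy is happy?", candidates):
    print(score, doc)
```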

1 Like

That would mean that up to 200 documents can be completely consumed/processed by the endpoint when generating an answer, but once that limit is exceeded (which is only possible via file), it has to select the top 200 (max) semantically similar docs for answer generation. Very interesting.

Thanks @lmccallum

I believe that is correct. And regardless of whether or not a file is used, a maximum of 2048 tokens per line is permitted.
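A quick way to pre-check your lines before uploading is to approximate token counts. The sketch below uses a crude whitespace split as a stand-in tokenizer; real tokenization is BPE-based, so actual counts are usually higher, and the 2048 figure is the per-line limit discussed above:

```python
MAX_TOKENS_PER_LINE = 2048  # per-line limit discussed in this thread

def rough_token_count(text):
    """Crude approximation: whitespace-split word count.

    The API tokenizes with BPE, so real counts are typically higher;
    use this only as a cheap sanity check before uploading.
    """
    return len(text.split())

def lines_over_limit(documents, limit=MAX_TOKENS_PER_LINE):
    """Return indices of document lines that (approximately) exceed the limit."""
    return [i for i, doc in enumerate(documents)
            if rough_token_count(doc) > limit]

docs = ["Puppy A is happy.", "word " * 3000]
print(lines_over_limit(docs))  # only the second, very long line is flagged
```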

1 Like