When are 400 errors thrown by the Answers endpoint?

stevet · April 27, 2021, 1:30pm

I think I know the answer to this question but just want to confirm. It seems the 400 behavior is different when using a file vs the documents parameter when calling the Answers endpoint.

For example, when I use a file and ask a question with an answer that can’t be derived from the documents, the API throws a 400 error. However, if I make the same request, with the same question and the same documents, but provide the documents using the documents parameter, I don’t get a 400 error. Rather, I get an answer that could be way outside the scope of my documents.

My assumption is that this is because the “two-step search procedure” is being used with the file and not the documents sent via the documents parameter.

Can someone confirm or clarify this for me?

hallacy · April 27, 2021, 2:00pm

Hi @stevet!

I think you basically nailed it. That said, it sounds like there might be a few things going on here. I see two possible cases:

(1, most likely) When a user sends a request to /v1/answers with the file parameter, we’ll try to filter the number of possible results down to max_rerank. Right now that filtering process is keyword based, so it’s decently easy for nothing to come back from this step. In that case, you should get a 400 error back with a message to the effect of “no documents found” or something like that.

If you submit a request with the documents parameter set, we assume you want those examples reranked and so we don’t do a filtering step. We just immediately rerank them.

(2) That 404 is bugging me a little bit. It’s not quite the status code I’d expect for a file that didn’t return results. Can you post the error message you’re getting and maybe the api call you’re making?
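To make case (1) concrete, here is a rough Python sketch of the two paths. It is only an illustration of the behavior described in this post, not the actual server code: the function names, the token-overlap filter, and the overlap-based "rerank" are all assumptions made up for the example, and the real keyword filter and reranker are almost certainly more involved.

```python
import re


class NoSimilarDocumentsError(Exception):
    """Stands in for the HTTP 400 'No similar documents were found' response."""


def tokenize(text):
    # Lowercased word tokens; an assumed, deliberately naive notion of "keyword".
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_filter(question, documents, max_rerank):
    # Step 1 (file path only): keep documents sharing at least one token
    # with the question, capped at max_rerank.
    question_tokens = tokenize(question)
    matches = [doc for doc in documents if question_tokens & tokenize(doc)]
    return matches[:max_rerank]


def rerank(question, documents):
    # Step 2: placeholder for the search-model rerank. It only reorders
    # documents by token overlap and never drops any of them.
    question_tokens = tokenize(question)
    return sorted(
        documents,
        key=lambda doc: len(question_tokens & tokenize(doc)),
        reverse=True,
    )


def select_documents(question, *, file_documents=None, documents=None, max_rerank=10):
    if file_documents is not None:
        # `file` parameter: filter first, and fail if nothing survives.
        candidates = keyword_filter(question, file_documents, max_rerank)
        if not candidates:
            raise NoSimilarDocumentsError("No similar documents were found in file.")
        return rerank(question, candidates)
    # `documents` parameter: no filtering step, everything is reranked.
    return rerank(question, documents or [])
```

With the two puppy documents from the docs example, a question such as "How old are you?" shares no tokens with either document, so the file path raises while the documents path simply returns both documents in some order.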

stevet · April 27, 2021, 2:12pm

Thanks for the quick reply, @hallacy! The error is actually a 400 error, not a 404 (my bad). So the error message is being included in the response payload. Here is the response payload I’m getting.

{
  error: {
    code: null,
    message: "No similar documents were found in file with ID 'file-oi6E5kMlnYSb0Zx05WmSKJSX'.Please upload more documents or adjust your query.",
    param: null,
    type: 'invalid_request_error'
  }
}

It can be recreated using the example data from the documentation page which is what I used when I originally noted the behavior. Here are the contents of the .jsonl…

{"text": "puppy A is happy", "metadata": "emotional state of puppy A"}
{"text": "puppy B is sad", "metadata": "emotional state of puppy B"}

Lastly, here are the request parameters I used.

{
  "file": "file-oi6E5kMlnYSb0Zx05WmSKJSX",
  "question": "How old are you?", 
  "search_model": "ada", 
  "model": "curie", 
  "examples_context": "In 2017, U.S. life expectancy was 78.6 years.", 
  "examples": [["What is human life expectancy in the United States?", "78 years."]],
  "max_rerank": 10,
  "max_tokens": 5,
  "stop": ["\n", "<|endoftext|>"],
  "return_metadata": true,
  "return_prompt": true
}

It makes sense to me now. So, basically, if there isn’t a keyword match, you’ll get a 400 error. Correct?
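For anyone who wants to tell this particular 400 apart from other invalid requests on the client side, here is a minimal sketch. It assumes the payload shape shown above and matches on the start of the error message, which is fragile since the wording could change; the function name is mine, not part of any SDK.

```python
def is_no_match_error(status_code, payload):
    """Return True if the response looks like the 'no similar documents' 400."""
    # Assumed payload shape: {"error": {"message": ..., "type": ..., ...}}
    error = (payload or {}).get("error") or {}
    message = error.get("message") or ""
    return (
        status_code == 400
        and error.get("type") == "invalid_request_error"
        and message.startswith("No similar documents were found")
    )
```

A caller could use this to decide whether to rephrase the question or fall back to a default reply instead of treating the response as a malformed request.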

hallacy · April 27, 2021, 4:21pm

Bingo. We’re working on updating that filtering step to make it more robust in the future. If you have ideas I’m all ears!

stevet · April 27, 2021, 6:35pm

Thanks again. I don’t think it’s bad now - I mostly just wanted to understand it. The keyword filtering seems like a good way to filter down the number of possible results. That said, it might be nice if there was a way to “seed” a keyword list to create a broader context if needed. So, not just from the question. Maybe another parameter with an array of keywords. Does that make sense?

hallacy · April 27, 2021, 10:24pm

Oh interesting! I hadn’t thought of that. And that way you can expand the search list but it won’t interfere with the query?

stevet · April 27, 2021, 11:27pm

Exactly. Also, with that, you could run your own classification task for the question to match pre-defined keywords that could then be used for the request to the Answers endpoint.
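Roughly what I have in mind, as a Python sketch. Everything here is hypothetical: the `keywords` request parameter is only the idea proposed in this thread, not something the endpoint is known to accept, and the topic table, trigger words, and file ID are made up for illustration. The "classification task" is reduced to a trivial trigger-word lookup.

```python
import re

# Hypothetical pre-defined seed keywords per topic.
SEED_KEYWORDS = {
    "emotion": ["happy", "sad", "puppy"],
    "age": ["old", "age", "years"],
}

# Hypothetical trigger words used to classify a question into topics.
TRIGGERS = {
    "emotion": {"feel", "feeling", "mood", "emotion"},
    "age": {"old", "age", "born"},
}


def classify_question(question):
    # Stand-in for a real classification step: match trigger words.
    tokens = set(re.findall(r"[a-z0-9]+", question.lower()))
    return [topic for topic, triggers in TRIGGERS.items() if tokens & triggers]


def build_request(question, file_id):
    keywords = []
    for topic in classify_question(question):
        keywords.extend(SEED_KEYWORDS[topic])
    return {
        "file": file_id,
        "question": question,  # the query itself is left untouched
        "keywords": keywords,  # HYPOTHETICAL parameter proposed in this thread
    }
```

For example, `build_request("How is puppy A feeling?", "file-abc")` would keep the question as written and attach the "emotion" seed keywords, so the filtering step would have more to match on without changing the query.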

Foxalabs · July 22, 2023, 8:34am

Summary created by AI.

The forum discussion is about the difference in behavior of the Answers endpoint when using a file versus the documents parameter. Steve noticed that a 400 error is returned when a question is asked that can’t be answered from the documents in the file. But when the same documents are passed via the documents parameter, no 400 error is thrown. Hallacy confirmed this behavior, explaining that when a request is sent with the file parameter, results are keyword filtered down to max_rerank. A 400 error (not 404 as initially mentioned by Steve) means no similar documents were found. When the documents parameter is used, no filtering occurs; documents are directly reranked. The 400 error message can be seen in Steve’s response payload. He proposed a feature to seed additional keywords for a broader context during document search. A pre-defined keyword match could be run for the question before sending a request to the Answers endpoint, something Hallacy found interesting.

Summarized with AI on Jul 22
AI used: gpt-4-32k
