File_search with max num results

Hi,

It looks like the file search tool has a max_num_results restriction limiting how many matching results it will retrieve. I am providing the data for my domain context; if a user asks for a specific datapoint, there may be lots of data for that datapoint, and I want all of it to contribute to the response. Is this possible?

I see a file input API here for uploading PDF files, where the model is able to process and understand the file content. I would like to do a similar thing with a JSON file. Is that possible, or is it potentially a coming-soon feature?

I am aware of the file search tool with a vector store, but that seems to perform a similarity search, which isn’t exactly what I need. I need a way of uploading loads of data to the LLM so it can understand the context and data and answer user questions.

https://platform.openai.com/docs/guides/pdf-files

You ask a good question, as the file_search tool and its vector store database are offered without much information about the applications they are useful for.

The application you have in mind is a bit unclear, but it sounds like this might not be the right solution for you.

The way it works:

  1. Documents are uploaded to file storage.
  2. Then you create a vector store. One of its parameters is how many tokens each chunk of extracted document text will contain; the default is 800 tokens, perhaps 500 words per chunk.
  3. Including the tool in an Assistants or Responses endpoint API call then makes file search available for the AI to call with search queries.
  4. The AI doesn’t know what kind of knowledge is actually in the vector store unless you explain when file search is useful and what the expected results are.
  5. OpenAI uses language in multiple places indicating that it is the user who “uploaded files”, and that the knowledge in file search is the user’s, not the developer’s.
  6. The AI can call the tool with a search query, such as “company address”.
  7. A semantic search is done, which ranks the chunks and returns the top results as a tool-result block of text.
  8. The AI can then treat this as knowledge and answer from information retrieved across documents.
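The steps above can be sketched as request payloads. This is a minimal sketch assuming the OpenAI Python SDK’s request shapes; the vector store name, file name, and ID are placeholders, not real values:

```python
# Step 2: creating a vector store with the default chunking parameters.
create_vector_store = {
    "name": "domain-knowledge",            # placeholder name
    "chunking_strategy": {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": 800,  # the default chunk size
            "chunk_overlap_tokens": 400,   # the default overlap
        },
    },
}

# Step 3: a Responses endpoint call that includes the file_search tool.
responses_call = {
    "model": "gpt-4.1",
    "input": "What is the company address?",
    "tools": [
        {
            "type": "file_search",
            "vector_store_ids": ["vs_abc123"],  # placeholder ID
        }
    ],
}
```

In recent SDK versions these payloads would be passed to `client.vector_stores.create(...)` and `client.responses.create(...)`, after first uploading documents with `client.files.create(purpose="assistants", ...)`.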

Here are the limitations:

  • a maximum of 16,000 tokens of results will be placed in context, regardless of how big you make chunks or how many top results you specify
  • the max results parameter simply doesn’t work: you get 20 chunks even if you only wanted to pay for 5 per call
  • the results are entire sections of documents, drawn across all documents in a vector store, and only the similarity of the top chunks brings a document to prominence
  • tabular data, such as JSON or Excel files, is not permitted, and it would not work well anyway: a slice from the middle of a JSON file, or Excel data without headings or with troublesome document extraction, will just be poor, and there is no “search-like” quality when you’ve got 20-50 key/value pairs per chunk
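For reference, this is where the max results parameter sits in the tool configuration (the vector store ID is a placeholder); as noted above, in practice the cap behaves like 20 chunks and roughly 16,000 tokens regardless:

```python
# The file_search tool configuration accepts max_num_results,
# but the observed behavior does not honor it.
file_search_tool = {
    "type": "file_search",
    "vector_store_ids": ["vs_abc123"],  # placeholder ID
    "max_num_results": 5,               # what you would set to pay for fewer chunks
}
```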

So: if you do have little chunks of knowledge, like customer data or fine-grained results that must be returned individually, you would need to build your own function that can act more like a SQL query, with parameters to drill down into the type of data you offer.

With a search function of your own, you can provide a good description of how to use it and what it returns, and offer many query fields for the AI to use, such as names, date ranges, and other types of metadata. You can also budget how much is returned to the AI.
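A runnable sketch of such a function follows. The record shape, the tool name `search_datapoints`, and its fields are illustrative assumptions, not an OpenAI API; only the tool schema format matches OpenAI function calling:

```python
# Illustrative in-memory dataset standing in for your domain data.
RECORDS = [
    {"name": "temperature", "date": "2024-02-01", "value": 21.5},
    {"name": "temperature", "date": "2024-02-02", "value": 19.0},
    {"name": "humidity",    "date": "2024-02-01", "value": 0.45},
]

def search_datapoints(name=None, date_from=None, date_to=None, limit=50):
    """Filter records like a SQL query; `limit` budgets what goes back to the AI."""
    hits = [
        r for r in RECORDS
        if (name is None or r["name"] == name)
        and (date_from is None or r["date"] >= date_from)
        and (date_to is None or r["date"] <= date_to)
    ]
    return hits[:limit]

# The matching tool schema, in OpenAI function-calling format:
search_tool = {
    "type": "function",
    "name": "search_datapoints",
    "description": "Look up datapoints by name and ISO date range; returns raw records.",
    "parameters": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "date_from": {"type": "string"},
            "date_to": {"type": "string"},
        },
        "required": [],
    },
}
```

When the AI calls the tool, you run `search_datapoints(**arguments)` yourself and return the hits as the tool result, so you control exactly what and how much the model sees.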

Although the question you ask about the file search tool likely has a bigger question behind it, such as “how do I build this application”, I hope that offers some clarification of what OpenAI offers: a generic knowledge base built from readable documents.

I think you’re right that this file search / PDF file input may not be what I need. Is there documentation on how ChatGPT handles its file upload feature? I would like to achieve a similar outcome.

Essentially, my goal is to allow a user or system to upload a large amount of data/context for the model to process, have access to, and understand; the model will then use that information to answer the user’s questions. Sending the data directly quickly runs into the token limit, and file search doesn’t work well because it performs a semantic search rather than understanding the full context and answering from all of it. It looks like I’m running out of options here. Are there any other recommendations?

The GPT-4.1 series of models has a larger context window. It can handle up to 1 million tokens.

However, you will pay for all the input tokens you use.

This larger context window lets you include large amounts of text directly as messages. The model can then answer questions about that text without needing additional searches. It can also summarize the text fully because it sees the entire content at once.
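As a sketch, assuming a hypothetical JSON dataset, you can estimate whether the data fits and then place it directly in the input. The roughly four characters per token figure is a rule of thumb, not an exact tokenizer count:

```python
import json

# Hypothetical dataset; in your case this would be the real domain data.
data = {"datapoints": [{"id": i, "value": i * 1.5} for i in range(1000)]}
document = json.dumps(data)

# Rough token estimate: ~4 characters per token for English/JSON text.
approx_tokens = len(document) // 4

request = {
    "model": "gpt-4.1",  # up to 1 million tokens of context
    "input": [
        {"role": "developer",
         "content": "Answer only from this JSON data:\n" + document},
        {"role": "user",
         "content": "What is the value for datapoint id 42?"},
    ],
}
```

Because the whole dataset is in context, the model can answer from any part of it without a retrieval step, at the cost of paying for every input token on every call.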

If you have a request like, “Here’s everything on my hard drive—what did I say to dad in February?” you will still need some kind of search tool. This tool could be AI-powered or a simpler keyword or graph search. Its job is to narrow the large amount of information down to only the relevant parts that can answer the question, an amount the AI can understand and that fits within a budget.
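A minimal keyword pre-filter along those lines might look like this; the character budget is an arbitrary illustration of capping what reaches the model:

```python
def keyword_filter(chunks, query_terms, budget_chars=40_000):
    """Keep only chunks containing any query term, within a character budget."""
    kept, used = [], 0
    for chunk in chunks:
        if any(term.lower() in chunk.lower() for term in query_terms):
            if used + len(chunk) > budget_chars:
                break  # budget exhausted; stop adding chunks
            kept.append(chunk)
            used += len(chunk)
    return kept
```

The surviving chunks are what you would place in the model’s context, instead of the entire corpus.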

If you want users to attach files, you can limit the size of each attachment. Another option is to allow text extraction from attachments only up to a certain size. This approach fits the common user pattern, where people attach documents and ask specific questions about them.
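A simple sketch of such limits, where the specific byte and character caps are arbitrary choices you would tune for your own application:

```python
MAX_ATTACHMENT_BYTES = 2 * 1024 * 1024  # illustrative 2 MB upload cap
MAX_EXTRACT_CHARS = 200_000             # illustrative extraction limit

def accept_attachment(raw: bytes) -> str:
    """Reject oversized uploads, then extract at most MAX_EXTRACT_CHARS of text."""
    if len(raw) > MAX_ATTACHMENT_BYTES:
        raise ValueError("attachment too large")
    return raw.decode("utf-8", errors="replace")[:MAX_EXTRACT_CHARS]
```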