How File Search Works and Pricing

Is file_search used even if the prompt doesn’t require it? If it’s just a simple question, will I still be charged for it?

Could I have an elaboration on how $0.10 / GB of vector-storage per day (1 GB free) works? Are we charged for storage of files?

Or are we charged for each use of file_search when obtaining a response? If not, what’s the charge for this? Is there a standard value, or is it based on the context searched?

Is every single file searched or can it intelligently minimize the number?

The cost for vector storage is $0.10 per GB per day, with the first GB being free. This means if you store files that collectively use up to 1 GB of vector space, there is no charge. Any storage beyond this free 1 GB is charged at $0.10 per GB each day. For example, if you store 2 GB of data, you will be charged $0.10 per day for the additional 1 GB (OpenAI Help Center) (OpenAI Developer Forum).
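To make the arithmetic concrete, here is a minimal sketch of the storage charge described above. The rate and free tier come from this thread; the function names are made up for illustration, and the assumption that fractional GB are billed proportionally is mine, not something confirmed here.

```python
FREE_GB = 1.0            # first GB of vector storage is free (per this thread)
RATE_PER_GB_DAY = 0.10   # USD per GB per day beyond the free GB (per this thread)


def daily_storage_cost(stored_gb: float) -> float:
    """Estimate the daily vector-storage charge in USD.

    Assumes fractional GB are billed proportionally (an assumption).
    """
    billable_gb = max(0.0, stored_gb - FREE_GB)
    return round(billable_gb * RATE_PER_GB_DAY, 4)


def monthly_storage_cost(stored_gb: float, days: int = 30) -> float:
    """Estimate the charge over a number of days at a constant storage size."""
    return round(daily_storage_cost(stored_gb) * days, 2)


print(daily_storage_cost(0.5))   # under the free tier
print(daily_storage_cost(2))     # 1 GB billable
print(monthly_storage_cost(2))   # 30 days at 2 GB stored
```

So storing 2 GB works out to roughly $3 over a 30-day month, and anything at or under 1 GB costs nothing for storage.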

Yes, if file search is not used, you are charged only the normal price for the model you use.


Thank you, clarified a lot!

I have 2 questions. For context, suppose a user sends the following prompts:

User: Hi, what’s the color of apple?
AI: … normal response

User: query requiring file search
AI: responds after file search

  1. Is the model able to identify that the first prompt requires no file search?
  2. Suppose I have 10 files and only 1 of them contains the answer. Am I charged for every file it had to search, counted as input tokens?

When using the file search tool, you are primarily charged based on the vector storage used by your files, not for each individual search operation. The tokens used during retrieval operations are considered as part of the overall input tokens for the request.


  1. Vector Storage Cost:
  • You are charged $0.10 per GB of vector storage per day, regardless of how often the files are accessed.
  • The first 1 GB of storage is free.
  2. Retrieval and Input Tokens:
  • When a search operation is performed, the assistant retrieves chunks of data from the files.
  • The total number of tokens used in these chunks counts towards your input tokens for that request.
  • If multiple files are searched to find the relevant information, the tokens from all these files’ chunks are included in the input token count.
  3. Your Scenario:
  • If you have 10 files and only 1 contains the answer, the assistant might need to read through parts of multiple files to find the relevant one.
  • The tokens from all the chunks read from the 10 files will be counted as input tokens.
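As a rough illustration of the token accounting above, the sketch below adds the retrieved chunk sizes to the prompt size. The chunk sizes, the prompt size, and the per-token price are hypothetical numbers chosen for the example, and the function names are mine; check the current pricing for the model you actually use.

```python
from typing import Sequence


def estimate_input_tokens(prompt_tokens: int,
                          retrieved_chunk_tokens: Sequence[int]) -> int:
    """Input tokens for one request: the prompt plus every retrieved chunk.

    Chunks count no matter which file they came from.
    """
    return prompt_tokens + sum(retrieved_chunk_tokens)


def estimate_input_cost(input_tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token count to USD using a caller-supplied price (hypothetical here)."""
    return input_tokens * usd_per_million_tokens / 1_000_000


# Hypothetical request: a 50-token question, with three chunks pulled
# from different files during the search.
chunks = [800, 800, 750]
tokens = estimate_input_tokens(prompt_tokens=50, retrieved_chunk_tokens=chunks)
print(tokens)

# Hypothetical price of $5.00 per million input tokens.
print(estimate_input_cost(tokens, usd_per_million_tokens=5.00))
```

In this sketch, what drives the input-token bill is how many chunks end up in the request and how large they are, not how many files are stored.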

I hope this helps answer your question.

Yes thanks a lot for your help!