Is file_search invoked even if the prompt doesn’t require it? If it’s a simple question, will I still be charged for it?
Could I get an elaboration on how $0.10 / GB of vector storage per day (1 GB free) works? Are we charged for storage of files?
Or is there a charge for each use of file_search when obtaining a response? If so, what is the charge? Is there a standard value, or is it based on the context searched?
Is every single file searched, or can it intelligently minimize the number?
The cost for vector storage is $0.10 per GB per day, with the first GB being free. This means if you store files that collectively use up to 1 GB of vector space, there is no charge. Any storage beyond this free 1 GB is charged at $0.10 per GB each day. For example, if you store 2 GB of data, you will be charged $0.10 per day for the additional 1 GB (OpenAI Help Center) (OpenAI Developer Forum).
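The tiered storage pricing above can be sketched as a small calculation. This is an illustrative helper, not an official billing formula; the $0.10/GB/day rate and 1 GB free tier are taken from the answer above.

```python
def daily_storage_cost(total_gb: float, rate_per_gb: float = 0.10, free_gb: float = 1.0) -> float:
    """Daily vector-storage cost: the first `free_gb` is free, the rest billed at `rate_per_gb`."""
    billable_gb = max(0.0, total_gb - free_gb)
    return billable_gb * rate_per_gb

# 2 GB stored -> 1 GB billable -> $0.10/day
print(daily_storage_cost(2.0))   # 0.1
# 0.5 GB stored -> entirely within the free tier -> $0.00/day
print(daily_storage_cost(0.5))   # 0.0
```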
Yes. If file_search is not used, you are simply charged the normal price for the model you use.
When using the file search tool, you are primarily charged for the vector storage used by your files, not for each individual search operation. The tokens retrieved during a search are counted as part of the overall input tokens for that request.
Breakdown:
Vector Storage Cost:
You are charged $0.10 per GB of vector storage per day, regardless of how often the files are accessed.
The first 1 GB of storage is free.
Retrieval and Input Tokens:
When a search operation is performed, the assistant retrieves chunks of data from the files.
The total number of tokens used in these chunks counts towards your input tokens for that request.
If multiple files are searched to find the relevant information, the tokens from all these files’ chunks are included in the input token count.
Your Scenario:
If you have 10 files and only 1 contains the answer, the assistant might need to read through parts of multiple files to find the relevant one.
The tokens from all the chunks read from the 10 files will be counted as input tokens.
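The token accounting described above can be sketched as a simple sum: whatever chunks the tool retrieves, their token counts are added to the request's input tokens. The chunk sizes below are made-up illustrative numbers, not measured values.

```python
def retrieval_input_tokens(chunk_token_counts: list[int]) -> int:
    """Tokens from every retrieved chunk count toward the request's input tokens,
    regardless of which file each chunk came from."""
    return sum(chunk_token_counts)

# e.g. 5 chunks retrieved across several files, up to ~800 tokens each
print(retrieval_input_tokens([800, 800, 750, 800, 600]))  # 3750
```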
You mentioned, “the tokens from all these files’ chunks are included in the input token count.” Could you please clarify if you mean that all chunks from the vector store are included in the input token count, or only the retrieved chunks that have similarity with the user question?
The statement refers to the retrieved chunks that have similarity with the user question. When you perform a search, the File Search tool retrieves relevant chunks from the vector store based on their similarity to the user’s query. Only these retrieved chunks are included in the input token count for generating a response, not all chunks stored in the vector store.
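The similarity-based selection described above can be illustrated with a toy top-k retrieval over a tiny in-memory "vector store". This is a simplified sketch of the general technique (cosine similarity, keep only the top-k chunks), not OpenAI's actual retrieval implementation; the embeddings and texts are invented for the example.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec: list[float], store: list[dict], k: int = 2) -> list[dict]:
    """Return only the top-k chunks most similar to the query.
    Everything else stays in the vector store and never enters the model context."""
    ranked = sorted(store, key=lambda c: cosine(query_vec, c["embedding"]), reverse=True)
    return ranked[:k]

store = [
    {"text": "billing details", "embedding": [0.9, 0.1]},
    {"text": "unrelated notes", "embedding": [0.0, 1.0]},
    {"text": "pricing table",   "embedding": [0.8, 0.3]},
]
top = retrieve([1.0, 0.0], store, k=2)
print([c["text"] for c in top])  # ['billing details', 'pricing table']
```

Only the two returned chunks would contribute to the input token count; the "unrelated notes" chunk, though stored, costs nothing at query time.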
You incur maintenance fees for vector-store space exceeding 1 GB.
You are also charged for input tokens when the model processes this tool's output to respond: by default, up to 20 chunks of 800 tokens or fewer are placed into the model's context.
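Using the defaults mentioned above (20 chunks of at most 800 tokens each), the worst-case retrieval contribution to input tokens per request can be bounded as follows; this is a back-of-the-envelope ceiling, not a guaranteed charge.

```python
def max_retrieval_tokens(max_chunks: int = 20, chunk_tokens: int = 800) -> int:
    """Upper bound on input tokens file_search can add to a single request
    with the default chunk limits (assumed from the post above)."""
    return max_chunks * chunk_tokens

print(max_retrieval_tokens())  # 16000
```

In practice fewer chunks are usually retrieved, so actual input-token usage per request will typically be well below this ceiling.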