High Costs and Input Tokens with Assistants API File Search

zejiran · May 10, 2024, 8:11pm

I recently ran an experiment to understand how file size affects cost when using the Assistants API with File Search enabled. The results surprised me: cost did not consistently correlate with file size. All the files I used were under 300 KB. This suggests that something other than file size, possibly internal API processing, is driving the cost.

During a basic conversation involving three user messages (8 messages in total on the thread), around 80k input tokens were consumed to produce just 400 output tokens, with the vector store attached to the thread totaling 404560 kb. I can't reconcile that input token count with how little text was actually exchanged.
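To put those numbers in perspective, here is a rough sketch of the arithmetic. The token counts come from the run described above; the per-million-token prices are placeholder assumptions, not quoted rates, so substitute the current values from the pricing page for whichever model you use. The helper name `estimate_cost` is made up for illustration.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_1m, output_price_per_1m):
    """Estimate the USD cost of a run from its token counts.

    Prices are expressed in USD per 1 million tokens.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_1m
    output_cost = output_tokens / 1_000_000 * output_price_per_1m
    return input_cost + output_cost


# Token counts reported in the post above.
input_tokens = 80_000
output_tokens = 400

# Input tokens consumed per output token produced.
ratio = input_tokens / output_tokens  # 200.0

# ASSUMPTION: placeholder prices (USD per 1M tokens), for illustration only.
ASSUMED_INPUT_PRICE = 10.0
ASSUMED_OUTPUT_PRICE = 30.0

cost = estimate_cost(input_tokens, output_tokens,
                     ASSUMED_INPUT_PRICE, ASSUMED_OUTPUT_PRICE)
print(f"input/output ratio: {ratio:.0f}x")
print(f"estimated cost for the conversation: ${cost:.3f}")
```

Whatever prices you plug in, the point is the same: at a 200:1 ratio the bill is dominated almost entirely by input tokens, so that is the number to watch.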

Despite only interacting with a small portion of the file in my tests, the costs were unexpectedly high, raising concerns about the practicality of deploying this feature in user-facing applications.

Furthermore, in a simple test with the File Search tool activated, a basic greeting like “Hello” followed by “Question me” without any uploaded file should account for only 4 input tokens according to the tokenizer tool. However, the API reported 1874 input tokens and 50 output tokens used on the playground, even though no files were uploaded and my current assistant instructions were just 338 tokens.
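For anyone who wants to reproduce this check, a minimal sketch of the subtraction follows. It uses only the figures quoted above; the function name `unexplained_overhead` is hypothetical, and the calculation says nothing about what the remaining tokens actually contain, only how many of them are not accounted for by the instructions and the messages.

```python
def unexplained_overhead(reported_tokens, *known_token_counts):
    """Return the reported input tokens not explained by known sources."""
    return reported_tokens - sum(known_token_counts)


# Figures from the playground test described above.
reported_input_tokens = 1874   # input tokens reported by the API
instruction_tokens = 338       # assistant instructions
message_tokens = 4             # "Hello" + "Question me" per the tokenizer tool

overhead = unexplained_overhead(
    reported_input_tokens, instruction_tokens, message_tokens
)
share = overhead / reported_input_tokens

print(f"unexplained input tokens: {overhead}")
print(f"share of reported input: {share:.1%}")
```

With these figures, 1532 of the 1874 reported input tokens (roughly 82%) come from neither my instructions nor my messages.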

These anomalies in token usage and cost calculations are puzzling and potentially prohibitive for cost-effective application. I’m reaching out to the community for insights or suggestions on how to better manage or understand costs associated with File Search in the Assistants API. Any shared experiences or tips on this would be greatly appreciated.

amanqazi1818 · May 27, 2024, 11:42am

Having the same issue! Thinking of just scrapping the Assistants idea and using LangChain with embeddings instead.

EduGPT · May 27, 2024, 1:40pm

Same, 18K tokens per call. But for us that’s still cheap.

I don’t know what they did to File Search, but it is now very good, in the playground at least. It finds hard answers across many files.

zejiran · May 29, 2024, 11:52am

Update: I repeated the same experiment using GPT-4o, and it now seems to use roughly half as many tokens as before (~40k).

dvala453 · October 31, 2024, 6:02am

Facing the same issue. I did a fair amount of research on it, but since it is in beta I couldn't find much documentation. One thing I noticed is that although gpt-4o takes a high number of input tokens, its cost is lower compared to other models such as gpt-3.5-turbo and later. Refer to this pricing doc for the same: https://openai.com/api/pricing/

Let me know if anyone has good insights on this. I really need them.
