Does `context token` include the uploaded file in Assistant messages?

pcpliu.dev · December 29, 2023, 6:39pm

Hi, I’m wondering if someone familiar with the Assistant billing setup can help me understand how context tokens are calculated.

Yesterday I was testing an assistant with knowledge retrieval in the Playground.

I created an assistant, then asked 5 or 6 questions, attaching a text file to each one (each file contains roughly 4k English words). I did some quick summarization work and was done. Then I deleted all assistants and files.

Later, I checked the billing and saw about 170K context tokens and 1K generated tokens.

What exactly were those 170K context tokens? Does the assistant reading files count as context tokens?

I'm a little confused here. What I'm really asking is how I can estimate my costs when working with assistants and files.

Thanks!

pcpliu.dev · December 29, 2023, 6:41pm

More context: in those assistant responses, I saw references (raw text) to the files. Do those count as context tokens?

anon22939549 · December 29, 2023, 11:00pm

Anything sent to the model is counted.

If the model retrieves text from a file, that text is context and is billed accordingly.
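For a rough sense of scale, here is a minimal sketch that converts word counts into token estimates. It assumes the common rule of thumb of roughly 4 tokens per 3 English words; the real tokenizer will give different numbers, and `estimate_tokens` is a hypothetical helper, not part of any OpenAI library.

```python
# Rough token estimate from a word count.
# ASSUMPTION: about 4 tokens per 3 English words (a rule of thumb only);
# an actual tokenizer will produce somewhat different counts.

def estimate_tokens(word_count: int) -> int:
    """Return ceil(word_count * 4 / 3) using integer arithmetic."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    return (word_count * 4 + 2) // 3


def estimate_total_tokens(word_counts: list[int]) -> int:
    """Sum the per-file estimates for a list of file word counts."""
    return sum(estimate_tokens(words) for words in word_counts)


if __name__ == "__main__":
    files = [4000] * 6  # six files of about 4k words each, as in the question
    print(estimate_tokens(4000))
    print(estimate_total_tokens(files))
```

Under this heuristic a 4k-word file is about 5.3k tokens, and six such files are about 32k tokens if each is counted once. That is well below the 170K reported, which would be consistent with file text or other context being sent to the model more than once, though the heuristic alone cannot confirm that.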

Tarmenale · January 13, 2024, 10:01am

I have a similar question related to costs. I’ve noticed that when I create an Assistant with files, the ‘context token’ count gets really high. So my question is: are files read once per thread, or does every message carry the files’ content, so that every time I ask a question in the same thread it generates a high ‘context token’ cost?

Example:

Let’s assume we have a file which has 10k tokens
I create a thread
I create a message with 100 tokens
I run the Assistant
I get a cost of 10.1k tokens for ‘context token’ and X tokens for the produced result (this part is clear)
Then I write a SECOND message in the existing thread, which also contains 100 tokens.

Question: how many context tokens will the second message cost?
a) 10.1k = 10k from the file + 0.1k from the SECOND message
b) 10.2k = 10.1k from the whole thread conversation + 0.1k from the SECOND message
c) 0.1k = 0.1k from the SECOND message, because the model already charged for the files and reusing historical data does not generate costs
d) something between 0.1k and 1.0k = the SECOND message costs 0.1k and there is an algorithm which generates some unknown cost based on historical messages (in short: gambling cost)
e) it works totally differently: nobody can explain how tokens are calculated, so users must be ready to pay for 128k tokens per request in the worst case, as this is the maximum capacity of GPT-4.
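To make the arithmetic behind options a) to c) explicit, here is a small sketch. It only encodes the three hypotheses as stated in the question; it says nothing about which one the Assistants API actually follows. Assistant reply tokens are ignored, as in the example, and `second_run_context` is a hypothetical helper name.

```python
# Context-token cost of the SECOND run under hypotheses a), b) and c).
# ASSUMPTIONS: numbers come from the example above; assistant replies are
# ignored; these are the poster's hypotheses, not documented API behaviour.

FILE_TOKENS = 10_000
MESSAGE_TOKENS = 100


def first_run_context(file_tokens: int, message_tokens: int) -> int:
    """First run: file content plus the first message."""
    return file_tokens + message_tokens


def second_run_context(option: str, file_tokens: int, message_tokens: int) -> int:
    """Context tokens billed for the second run under the given hypothesis."""
    if option == "a":
        # File is re-read, but earlier messages are not re-sent.
        return file_tokens + message_tokens
    if option == "b":
        # Whole thread so far (file + first message) plus the new message.
        return first_run_context(file_tokens, message_tokens) + message_tokens
    if option == "c":
        # Only the new message is billed.
        return message_tokens
    raise ValueError(f"unknown option: {option!r}")


if __name__ == "__main__":
    for option in ("a", "b", "c"):
        print(option, second_run_context(option, FILE_TOKENS, MESSAGE_TOKENS))
```

Options d) and e) are left out because they do not define a fixed formula.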

mb0 · March 26, 2024, 10:12am

Thanks for the detailed case. My experiments also showed high token over-usage, so b) seems to be the answer to your question.
My case: Real context sharing by assistant within thread - API / Feedback - OpenAI Developer Forum
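If b) does hold, the billed context adds up across runs, because each run re-sends everything before it. The sketch below totals that under the same simplifying assumptions as the example (a 10k-token file, 100-token messages, assistant replies ignored). `cumulative_context_tokens` is a hypothetical helper, and the model is an assumption drawn from this thread, not a documented billing formula.

```python
# Total context tokens billed over several runs in one thread,
# ASSUMING hypothesis b): every run re-sends the file and all prior messages.
# Assistant replies are ignored, which understates the real total.

def cumulative_context_tokens(file_tokens: int, message_tokens: int, runs: int) -> int:
    """Sum the per-run context sizes for `runs` consecutive user messages."""
    total = 0
    context = file_tokens  # the file content is present from the first run
    for _ in range(runs):
        context += message_tokens  # the thread grows by one user message
        total += context           # the whole context is billed again
    return total


if __name__ == "__main__":
    for runs in (1, 2, 5):
        print(runs, cumulative_context_tokens(10_000, 100, runs))
```

With these numbers the file dominates: five short questions cost about five times the file size in context tokens, even though the messages themselves total only 500 tokens.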
