Using too many tokens for "incoming" requests

While my Python program is sending its requests (user prompts) to the Assistant, each request uses nearly 500 input tokens, even though the prompt is only 8–15 words.
With each new few-word prompt, the "In" token count increases by roughly the same amount (~500) (screenshot Nr3).
I understand that within a single Thread the model "reads" the previous messages in the current chat, and that's fine. But it seems those ~500 "In" tokens are added on every request.
Does anybody know what might be the issue here?
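For illustration, here is a minimal sketch of why "In" tokens grow on every request in a single Thread: each run re-sends the assistant's instructions plus the full prior history along with the new prompt. The function name, the ~400-token instruction size, and the turn sizes below are all hypothetical, just to show the accumulation pattern:

```python
def cumulative_input_tokens(instruction_tokens, turns):
    """Estimate the input ("In") tokens billed per request in one thread.

    turns is a list of (user_prompt_tokens, assistant_reply_tokens) pairs.
    Each request re-sends the instructions plus the entire prior history,
    so input tokens grow with every turn even for short prompts.
    """
    history = 0
    per_request = []
    for user_tokens, reply_tokens in turns:
        # This request's input = instructions + all prior messages + new prompt
        per_request.append(instruction_tokens + history + user_tokens)
        # Both the prompt and the reply become history for the next request
        history += user_tokens + reply_tokens
    return per_request

# Hypothetical numbers: ~400-token instructions, three short prompts
print(cumulative_input_tokens(400, [(10, 100), (12, 110), (9, 95)]))
```

Under these assumed numbers, even a 10-word prompt costs hundreds of input tokens on the first request, and the cost keeps climbing as history accumulates.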



If you are making use of the retrieval feature, the Assistants engine may use up to 128k tokens of context to answer the query, so I imagine that function is what creates the initial ~500 tokens of usage.


Thank you. What does "use of the retrieval feature" mean?

If you have uploaded documents that should be used when answering questions, then you are making use of the retrieval feature.


I have not uploaded any documents for this Assistant.