Assistants API consumes too many prompt tokens. What is the reason and how can I reduce it?

Hello. I am building a chat assistant with the Assistants API. I tested it during the day, and from the logs I can see that it consumes far too many prompt tokens: usually more than 5,000 with the GPT-4 model, and 8,000 to 9,000 or more with the GPT-3 model. My assistant does not have any tools enabled, but when I try it in the Playground it does not consume nearly as many prompt tokens.
What is the reason? I have attached a screenshot from some run logs.


I think this is because of the messages. The first messages have lower prompt usage, but as I sent more messages to the assistant, the usage increased with each one. I thought deleting messages from the thread might help, but it did not.
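Here is roughly how I am checking the usage, in case it helps anyone reproduce this. A minimal sketch with the openai Python SDK; `create_and_poll` and the `run.usage` fields may differ in older SDK versions, and the assistant id is a placeholder:

```python
# Minimal sketch: log prompt-token usage per run to watch the growth.
# Assumes the official openai Python SDK (v1.x) and an existing assistant.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

thread = client.beta.threads.create()

def ask(text: str) -> None:
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content=text
    )
    # create_and_poll blocks until the run finishes (recent SDK versions)
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id, assistant_id="asst_..."  # placeholder id
    )
    if run.usage:
        print(f"prompt={run.usage.prompt_tokens} "
              f"completion={run.usage.completion_tokens}")

ask("First question")
ask("Second question")  # expect prompt_tokens to be noticeably higher
```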


@javidd - You are on the right track. The conversation history is passed to the LLM on every run so it can pull context and information from earlier messages. As a rough rule of thumb, 1 token ~ 4 characters, or 100 tokens ~ 75 words; if you delete messages from the thread, you can estimate the expected token savings from the length of the deleted text and compare it against the logged usage. You could also keep the system prompt concise and use techniques like RAG for context-based retrieval (if applicable) to save on tokens. Hope this helps - Cheers!
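If you want to sanity-check those numbers, you can count tokens locally with the tiktoken library before sending. A rough sketch; the per-message formatting overhead the API adds is not included, so treat the result as a lower bound:

```python
# Sketch: estimate how many prompt tokens a thread's history will cost.
# Uses tiktoken; per-message formatting overhead is approximated away.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

messages = [
    "You are a helpful assistant.",        # system prompt
    "What is the capital of France?",      # user
    "The capital of France is Paris.",     # assistant
]

total = sum(len(enc.encode(m)) for m in messages)
print(f"~{total} tokens of raw text "
      f"(plus a few tokens of per-message overhead)")
```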


Actually, I used the default options for the API. Later I created a new thread and checked the usage again. As I said, the usage increased with each new message. I am using just simple prompts, so it is not an API configuration or complex-prompt problem.

Hi, I read your comment, but what you described was not the reason. Have you tried the Assistants API yourself? I searched the internet and this community, and I see that many people have this problem. In 99% of cases the cause is the history. I think OpenAI should add a parameter like history_mode on/off so that we can use fewer tokens. Otherwise, why use the Assistants API if it consumes so many tokens? I wanted to use it for its file-reading feature and to build an assistant service. Maybe someone from OpenAI will read my comment and answer.
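Update: newer revisions of the Assistants API appear to expose something close to this, a truncation_strategy parameter on runs that limits how much history is sent to the model. A sketch, assuming the v2 Assistants endpoints; the ids are placeholders, and older SDK/API versions may not support the parameter:

```python
# Sketch: cap how much conversation history each run sends to the model.
# Assumes Assistants API v2; truncation_strategy is unsupported in older
# API versions, so verify against your SDK's documentation.
from openai import OpenAI

client = OpenAI()

run = client.beta.threads.runs.create_and_poll(
    thread_id="thread_...",       # placeholder thread id
    assistant_id="asst_...",      # placeholder assistant id
    truncation_strategy={
        "type": "last_messages",  # only send the most recent messages
        "last_messages": 4,       # keeps prompt tokens roughly bounded
    },
)
print(run.usage.prompt_tokens if run.usage else "no usage reported")
```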