There seems to be an issue with how tokens are calculated within the thread

UXsniff · April 9, 2024, 6:38am

This is how the Assistants API calculates token usage within a thread when using function calling.

User:
Tell me about my site today

Assistant:
Calling function… (JSON as return)

Assistant:
return response.
(6K tokens)

User:
do you remember the information in your previous message?

Assistant:
Yes, I remember the details from the previous message. How can I assist you further with that information?
(6K tokens)

User:
If I ask a similar question again, will you call the function again to retrieve data, or use the information obtained from the previous message?

Assistant:
If your question is closely related to the information already retrieved and discussed in the previous message, I would use the same information to answer your question without calling the function again. This helps provide quick and efficient responses based on the most recent data available. However, if the question requires new or updated data that wasn’t previously retrieved, I might need to call the relevant function again. How may I assist you further?
(6K tokens)

User:
OK tell me the total traffic from US, UK and India

Assistant:
Based on the information from the previous data:
…
(6K tokens)

User:
is the sky blue?
(6K tokens)

What do you think? Is it fair to calculate tokens this way? I have multiple functions available, and some of them might return 100-200 KB of JSON.
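
To make the pattern above concrete, here is a rough sketch of why every turn could show roughly the same 6K figure. It assumes that each run re-reads the entire thread, including the earlier function output, and it approximates tokens as characters divided by 4, which is only a heuristic and not a real tokenizer. The `thread` contents, the `"tool"` role label, and the helper names are hypothetical and are not taken from the API.

```python
# Rough illustration only. Assumptions (not confirmed by the post):
#   1. every assistant reply is generated from the whole thread so far
#   2. tokens ~= characters / 4 (a heuristic, not a real tokenizer)

def estimate_tokens(text: str) -> int:
    """Crude token estimate: about 4 characters per token."""
    return max(1, len(text) // 4)

def prompt_tokens_for_each_reply(thread):
    """Return the estimated prompt size seen by each assistant reply.

    thread is a list of (role, content) pairs in chronological order.
    """
    running_total = 0
    billed = []
    for role, content in thread:
        if role == "assistant":
            # The reply is produced from everything that came before it.
            billed.append(running_total)
        running_total += estimate_tokens(content)
    return billed

# Hypothetical payload sized to land near the ~6K tokens seen in the post.
function_output = "x" * 24_000

thread = [
    ("user", "Tell me about my site today"),
    ("tool", function_output),  # JSON returned by the function call
    ("assistant", "Here is a summary of your site."),
    ("user", "do you remember the information in your previous message?"),
    ("assistant", "Yes, I remember the details."),
    ("user", "is the sky blue?"),
    ("assistant", "Yes."),
]

billed = prompt_tokens_for_each_reply(thread)
for turn, tokens in enumerate(billed, start=1):
    print(f"reply {turn}: ~{tokens} prompt tokens")
```

Under these assumptions the function output dominates every later turn, even an unrelated one such as "is the sky blue?". By the same heuristic, a 200 KB JSON result would add on the order of 50K tokens to each subsequent run, so trimming the function output to the fields actually needed before returning it is probably the cheapest fix.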
