Too many input tokens are used by Assistant

Yes, tool returns are also maintained as past conversation in a growing thread, with no option presented to expire or delete these hidden turns.

Unless you specifically tune the parameters, the file_search tool will return its maximum number of result chunks even when the documents have zero relevance, loading the maximum amount of billed tokens at every internal turn.

This post has clearer documentation of the file_search ranker, where you can set a similarity threshold so that completely unrelated document chunks aren’t maximizing the cost:

You can start at 0.40-0.50.
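
As a minimal sketch with the Python SDK: the assistant is created with both a lower result count and a score threshold on file_search. The vector store ID, model choice, and the specific numbers here are placeholders/assumptions, not a definitive recipe.

```python
from openai import OpenAI

client = OpenAI()

# Sketch: tune file_search so low-relevance chunks are filtered out
# and fewer chunks are injected on each internal turn.
assistant = client.beta.assistants.create(
    model="gpt-4o",
    instructions="Answer from the attached documents when relevant.",
    tools=[{
        "type": "file_search",
        "file_search": {
            "max_num_results": 8,          # fewer than the default chunk count
            "ranking_options": {
                "ranker": "auto",
                "score_threshold": 0.45,   # drop chunks scoring below this
            },
        },
    }],
    # "vs_your_store_id" is a placeholder for your existing vector store
    tool_resources={"file_search": {"vector_store_ids": ["vs_your_store_id"]}},
)
```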

There is also no token limit or budget for you to set; an internal limit obviously exists so the model's capability isn't exceeded, but it is not exposed to you. You can, however, limit the number of past turns with the run's truncation_strategy parameter.
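
A minimal sketch of that, again with the Python SDK; the thread and assistant IDs and the turn count are placeholders:

```python
from openai import OpenAI

client = OpenAI()

# Sketch: only the most recent thread turns are sent to the model on this run,
# instead of the entire growing conversation.
run = client.beta.threads.runs.create(
    thread_id="thread_your_id",        # placeholder
    assistant_id="asst_your_id",       # placeholder
    truncation_strategy={
        "type": "last_messages",
        "last_messages": 6,            # number of most recent turns to keep
    },
)
```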