Assistants API token usage - token usage is more than the whole attached file plus prompts

The documentation answers that the Assistants agent framework pays no mind to your budget…

Retrieval currently optimizes for quality by adding all relevant content to the context of model calls. We plan to introduce other retrieval strategies to enable developers to choose a different tradeoff between retrieval quality and model usage cost.

“All relevant content” = all that will fit in the model’s context length.

The assistant and its internal functions for retrieval and other tools have their own internal language, and that also consumes tokens.
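To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It is not how the Assistants backend actually counts tokens. The function names, the overhead figure, and the idea that a single run makes several model calls that each resend the retrieved content are assumptions for illustration only.

```python
def estimate_call_input_tokens(retrieved_tokens, prompt_tokens,
                               overhead_tokens, context_limit):
    """Rough input size of one model call (hypothetical model).

    Retrieved content, your prompts, and the assistant's internal tool
    text are added together, then capped at the model's context length.
    """
    total = retrieved_tokens + prompt_tokens + overhead_tokens
    return min(total, context_limit)


def estimate_run_input_tokens(model_calls, retrieved_tokens, prompt_tokens,
                              overhead_tokens, context_limit):
    """Assumes every model call in the run resends the same context."""
    per_call = estimate_call_input_tokens(
        retrieved_tokens, prompt_tokens, overhead_tokens, context_limit
    )
    return model_calls * per_call


# Illustrative numbers only: a 20,000-token file, 200 tokens of prompts,
# 500 tokens of assumed internal overhead, a 128,000-token context,
# and 3 model calls in the run.
billed = estimate_run_input_tokens(3, 20_000, 200, 500, 128_000)
print(billed)           # 62100
print(20_000 + 200)     # 20200, the "file plus prompts" figure
```

Under these assumptions the billed input is about three times the size of the file plus prompts, which matches the kind of surprise described in the thread title.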
