Instructions consume tokens, and so do chat history, retrieval of data from your stored files, and anything sent back from the code interpreter and tool calls. The GPT models are stateless: everything has to be sent to the model from scratch every time you make an API call.
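To make the arithmetic concrete, here is a rough sketch of how input tokens add up when the whole context is resent on every call. The function name and all the token counts are hypothetical, picked only for illustration; real counts come from the tokenizer and from the usage figures the API reports, and an assistant may truncate or manage history differently.

```python
def input_tokens_per_call(instruction_tokens, turns, retrieval_tokens=0, tool_tokens=0):
    """Estimate the input tokens sent on each successive call.

    turns is a list of (user_tokens, reply_tokens) pairs, one per exchange.
    Assumption: the full history is resent each time, and the same retrieval
    and tool overhead is attached to every call.
    """
    per_call = []
    history = 0  # tokens from all earlier messages in the conversation
    for user_tokens, reply_tokens in turns:
        # Every call carries the instructions, all prior history,
        # the new user message, and any retrieved or tool content.
        per_call.append(
            instruction_tokens + history + user_tokens
            + retrieval_tokens + tool_tokens
        )
        # The new message and the model's reply become history for the next call.
        history += user_tokens + reply_tokens
    return per_call


# Hypothetical numbers: 500-token instructions, three exchanges of a
# 50-token question and a 200-token answer, 1000 tokens of retrieved text.
calls = input_tokens_per_call(500, [(50, 200)] * 3, retrieval_tokens=1000)
print(calls)       # [1550, 1800, 2050]
print(sum(calls))  # 5400
```

Even with short messages, each call costs more than the last, because the instructions and the growing history are billed again every time.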