Hi,
When adding messages to a thread and running the thread, does this process include everything in the thread history and feed it to the model? If yes, does this mean that all tokens (or maybe the ones not truncated) count towards the input token count?
I saw a few posts (for some reason I can't include links in my post) about the Assistants API's pricing and token usage, but I never found an answer, and those posts are closed.
Thank you!
All of the prior messages are sent to the model each time. If you have stored files that are also used as context, their content can be sent as well, which adds further to the token count.
So, to clarify, the underlying AI is stateless. The Assistants API manages your past conversations behind the scenes, but those past messages need to be sent to the model for every API call.
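To make the cost implication concrete, here is a rough sketch of how input tokens add up when the whole history is resent on every run. It does not call the real API: `count_tokens` is a hypothetical stand-in that counts whitespace-separated words instead of using a real tokenizer, and the sketch assumes nothing gets truncated, so treat the numbers as an illustration only.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer (assumption): one token per word.
    return len(text.split())


def input_tokens_per_run(user_messages, replies):
    """Return the input token count of each run when the full history is resent."""
    history = []
    per_run = []
    for user_msg, reply in zip(user_messages, replies):
        history.append(user_msg)
        # The model is stateless, so every message so far is sent as input.
        per_run.append(sum(count_tokens(m) for m in history))
        # The reply joins the history and is billed as input on later runs.
        history.append(reply)
    return per_run


user_messages = [
    "What is a token?",
    "Give me an example.",
    "Thanks, one more question.",
]
replies = [
    "A token is a chunk of text.",
    "The word cat is one token.",
    "Sure, go ahead.",
]

per_run = input_tokens_per_run(user_messages, replies)
total_input = sum(per_run)
print(per_run)      # [4, 15, 25]
print(total_input)  # 44
```

Each run costs more input than the one before it, even though every new user message is the same length. Any truncation the service applies would cap that growth, and file content pulled in as context would add to it.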