I did search for similar threads, since I assumed this would be a popular question, but the few I found were old and unanswered.
I am building an assistant in the OpenAI Assistants interface online, which I will call from my app, and I was wondering how tokens are calculated. My assumption is the following.
If I create a thread in my app and send Message 1, the input tokens will be calculated from the Instructions (system prompt) + the tools/functions defined (i.e. their definitions) + the files uploaded. The output tokens will come from Response 1.
For Message 2, will I be billed for input tokens covering the system prompt + tools + files uploaded + Message 1 + Response 1? Output tokens will obviously only cover Response 2.
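To make the assumption concrete, here is a rough sketch of the billing model I am describing. All token counts below are invented for illustration; the real numbers depend on the model's tokenizer, and the actual figures are reported in the `usage` field of the run object.

```python
# Hypothetical token counts (made-up numbers, not real measurements).
SYSTEM_PROMPT = 900   # instructions
TOOLS = 400           # tool/function definitions
FILES = 1500          # uploaded file context

def input_tokens_for_turn(history):
    """Assumed model: every turn re-sends the system prompt, tools, files,
    and the full prior conversation. history is a list of
    (message_tokens, response_tokens) pairs from earlier turns."""
    base = SYSTEM_PROMPT + TOOLS + FILES
    return base + sum(m + r for m, r in history)

# Turn 1: no history yet; Message 1 is assumed to be 50 tokens.
turn1_in = input_tokens_for_turn([]) + 50
# Turn 2: history now contains Message 1 (50) and Response 1 (300);
# Message 2 is assumed to be 60 tokens.
turn2_in = input_tokens_for_turn([(50, 300)]) + 60

print(turn1_in)  # 2850
print(turn2_in)  # 3210
```

Under this assumption the input-token bill grows with every turn, since the whole thread is resubmitted each time, until the context window cap kicks in.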
I understand that the context window will be limited to a max token size depending on the model used.
Thanks in advance.