I'm currently using the new Assistants API and I'm having trouble calculating the usage fees.
Suppose I write a prompt for an Assistant, and the user uploads a file A with every message. At the n-th round of questions, my input tokens will be:
- system prompt
- (n-1) * the token count of file A
- the sum of the question tokens over all n rounds
Is this correct?
If it is, is there any way I can shrink the (n-1) * file A tokens down to just 1 * file A tokens? (I'm assuming I could do this if every user's thread required the same file A and I uploaded file A to the Assistant for retrieval. But I want every user to be able to upload a different file that goes into a different thread.)
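
To make the arithmetic concrete, here is a minimal sketch of how I'm counting. It only encodes my assumed accounting from the list above, not confirmed billing behaviour, and the function names and token numbers are made up for illustration.

```python
def tokens_file_resent(system_tokens, file_tokens, question_tokens):
    """Input tokens at round n under my assumed accounting:
    system prompt + (n-1) * file A + all n questions so far."""
    n = len(question_tokens)
    if n == 0:
        raise ValueError("need at least one question")
    return system_tokens + (n - 1) * file_tokens + sum(question_tokens)


def tokens_file_once(system_tokens, file_tokens, question_tokens):
    """What I would like instead: file A counted only once."""
    if not question_tokens:
        raise ValueError("need at least one question")
    return system_tokens + file_tokens + sum(question_tokens)


# Hypothetical numbers: 200-token system prompt, 5000-token file,
# three rounds with questions of 30, 40 and 50 tokens.
questions = [30, 40, 50]
print(tokens_file_resent(200, 5000, questions))  # 200 + 2*5000 + 120 = 10320
print(tokens_file_once(200, 5000, questions))    # 200 + 5000 + 120 = 5320
```

With these numbers the gap is already about 5000 tokens at round 3 and grows by one file's worth of tokens every round, which is why I'm asking.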