Hello, I have a question about the pricing of the Assistants API, specifically how threads and conversation history are billed. The documentation says that both inputs and outputs are charged, which is clear to me. What I'm unsure about is how charges are calculated when a new message is sent in a thread that already contains conversation history. Am I charged only for the new message and its output, or for the entire conversation history (history + current message) plus the output? The documentation doesn't cover this explicitly, so I would appreciate clarification on which method is used.
Does the Assistants API charge only for the latest message and its output, or does it also charge for the entire conversation history within a thread?
Hi! Welcome to the forum!
The long and short of it is that you will be charged for everything.
You will be charged for the whole thread, the retrieved documents, your new query, and the new output every time the thread runs.
You’re not saving any money by using assistants.
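To make the arithmetic concrete, here is a rough sketch of how the billed input grows from run to run. It makes no API calls; the function name `billed_tokens_per_run` and all token counts are made up for illustration. It also simplifies things by ignoring the assistant's instructions and any truncation of long threads.

```python
def billed_tokens_per_run(turns, retrieved_tokens=0):
    """Estimate billed tokens for each run of a thread.

    turns: list of (user_tokens, assistant_tokens) pairs, oldest first.
    retrieved_tokens: hypothetical size of retrieved documents added per run.
    Returns a list of (input_tokens, output_tokens), one entry per run.
    """
    history = 0  # tokens already stored in the thread
    billed = []
    for user_tokens, assistant_tokens in turns:
        # Input for this run: prior history + retrieved docs + the new message.
        input_tokens = history + retrieved_tokens + user_tokens
        billed.append((input_tokens, assistant_tokens))
        # Both the new message and the reply become history for the next run.
        history += user_tokens + assistant_tokens
    return billed


if __name__ == "__main__":
    # Hypothetical three-turn conversation.
    turns = [(100, 200), (50, 150), (80, 120)]
    runs = billed_tokens_per_run(turns)
    for i, (inp, out) in enumerate(runs, start=1):
        print(f"run {i}: input={inp} output={out}")
    print("total input billed:", sum(inp for inp, _ in runs))
    print("new-message tokens only:", sum(user for user, _ in turns))
```

With these made-up numbers the three runs are billed 100, 350, and 580 input tokens (1030 in total), even though the new messages themselves add up to only 230 tokens.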
Hope this helps!