OpenAI team, thanks for the work you’re doing. I’m thrilled with today’s news.
I have a few questions about the Assistants API pricing structure that don’t seem to be documented anywhere:
- When are we charged? Is it upon initiating a run or upon adding a message to the thread?
- How are tokens calculated? Are we charged for the entire thread on each conversation turn (i.e., run)?
- What about token calculation for a long thread that you might have truncated in the background?
- How does token calculation work with knowledge retrieval?
- How can I estimate the number of tokens before each run? (My current rough approach is in the sketch below.)
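
For context, the best I've come up with so far is counting tokens locally with tiktoken, roughly like this. It assumes billing more or less tracks the full thread contents plus the assistant instructions on each run, which is exactly the part I can't confirm from the docs; the function and variable names are just my own.

```python
import tiktoken

# cl100k_base is the encoding used by the gpt-4 / gpt-3.5-turbo family
enc = tiktoken.get_encoding("cl100k_base")

def estimate_run_tokens(thread_messages: list[str], assistant_instructions: str) -> int:
    """Very rough estimate: tokens in every thread message plus the instructions.

    Ignores retrieval chunks, tool definitions, and any server-side truncation,
    so this is a guess at an upper bound, not what the bill will actually be.
    """
    total = len(enc.encode(assistant_instructions))
    for msg in thread_messages:
        total += len(enc.encode(msg))
    return total

# Example with made-up messages
messages = ["What does our refund policy say?", "Here is the policy summary..."]
print(estimate_run_tokens(messages, "You are a helpful support assistant."))
```

If retrieval content or background truncation changes what's actually sent to the model, an estimate like this could be way off, which is why I'd love official guidance.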
The Assistants API takes a lot of the backend work off our hands, but the pricing benefits are unclear. Developers give up some control over token usage, which could even lead to unexpected extra costs.
Thanks!