I’ve been experimenting with the Assistants API and noticed that there’s no way to get token usage (input and output) like you can with the Chat Completions API. Anyone have an idea how to get that information?
I’m running some evaluations around price and want to make sure what I’m implementing makes sense for my use-case.
On a basic level, you know the length of each thread that is run and the size of the responses generated, and you can count both with tiktoken.
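Here is a rough sketch of what that counting could look like. The function names (`count_tokens`, `estimate_run_usage`, `tiktoken_encoder`) are made up for illustration, and the default model name is just an assumption. It only counts the visible message text, so treat the result as a lower-bound estimate rather than what actually gets billed for a run.

```python
from typing import Callable, Dict, Iterable, List

# An "encoder" here is any function that turns text into a list of tokens.
Encoder = Callable[[str], List]


def count_tokens(texts: Iterable[str], encode: Encoder) -> int:
    """Sum the token counts of several pieces of text."""
    return sum(len(encode(text)) for text in texts)


def estimate_run_usage(
    thread_messages: Iterable[str],
    responses: Iterable[str],
    encode: Encoder,
) -> Dict[str, int]:
    """Estimate input/output tokens for one run.

    Counts only the raw message text. Instructions, tool definitions,
    retrieved context, and per-message formatting overhead are NOT
    included, so the real numbers are likely higher.
    """
    input_tokens = count_tokens(thread_messages, encode)
    output_tokens = count_tokens(responses, encode)
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
    }


def tiktoken_encoder(model: str = "gpt-4") -> Encoder:
    """Return a tiktoken-based encoder (requires `pip install tiktoken`)."""
    import tiktoken  # imported lazily so the helpers above work without it

    return tiktoken.encoding_for_model(model).encode
```

You would call it with something like `estimate_run_usage(messages, replies, tiktoken_encoder())`, where `messages` and `replies` are lists of strings you pulled from the thread yourself.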
Right, but it was really nice just having that information back in the response.
Using tiktoken adds extra overhead, and the numbers coming back may be imperfect because I don’t think we know exactly what’s being processed during a run.
It looks like they’ll be adding that at some point.
Thanks for the reference here! Good to know it’s something they are thinking about, and I totally appreciate that not everything can be delivered on day 1.