Token Usage When Calling Assistants through the API

How does one get the token usage when calling an assistant? It is part of the API response when calling the model directly, but strangely absent when calling assistants. This is vital for managing long interactions that could exceed the model’s context limit, since it informs the use of token-mitigation techniques such as chunking or RAG.

You only need to look at the `usage` field in the run object, which you can retrieve once a run is complete, the same way you poll that object for status while a run is in progress.

https://platform.openai.com/docs/api-reference/runs/object
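A minimal sketch, assuming the official `openai` Python SDK (v1.x); the thread and run IDs are placeholders, and the live retrieval call is shown commented out so the snippet runs without credentials:

```python
# Assumes the official `openai` Python SDK (v1.x); IDs below are placeholders.
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# run = client.beta.threads.runs.retrieve(
#     thread_id="thread_abc123", run_id="run_abc123"
# )
# usage = run.usage  # None until the run reaches a terminal state

# Per the run object docs, `usage` on a finished run has this shape:
usage = {"prompt_tokens": 123, "completion_tokens": 45, "total_tokens": 168}

def report_usage(usage):
    """Summarize a run's token usage, or note that the run is still in progress."""
    if usage is None:
        return "run still in progress; usage not yet available"
    return (f"prompt={usage['prompt_tokens']} "
            f"completion={usage['completion_tokens']} "
            f"total={usage['total_tokens']}")

print(report_usage(usage))  # prompt=123 completion=45 total=168
```

Note that `usage` is only populated once the run reaches a terminal state (completed, failed, cancelled, or expired); while the run is in progress it is null.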


Thanks, that is very helpful! 😀