+1
It would be helpful to have usage metrics per assistant, per thread, per message, and per run, similar to the usage data returned with each response in the completion API. My goal is to split usage by assistant and block further messages once a specified token limit is reached. This is a crucial API feature for many of us, and ideally it would already be available.
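To make the use case concrete, here is a minimal client-side sketch of the kind of per-assistant limit I have in mind. Everything in it is hypothetical: `UsageTracker`, `TokenBudgetExceeded`, and the assistant ID are names I made up for illustration, and it assumes a per-run total token count would be available to pass to `record`, which is exactly the piece that is missing today.

```python
class TokenBudgetExceeded(Exception):
    """Raised when an assistant has reached its token allowance."""


class UsageTracker:
    """Hypothetical client-side tracker; not part of any official SDK."""

    def __init__(self, limit_per_assistant: int) -> None:
        self.limit = limit_per_assistant
        self.used: dict[str, int] = {}  # assistant_id -> tokens consumed so far

    def record(self, assistant_id: str, total_tokens: int) -> None:
        # Assumption: total_tokens comes from a per-run usage report.
        self.used[assistant_id] = self.used.get(assistant_id, 0) + total_tokens

    def check(self, assistant_id: str) -> None:
        # Call this before sending a message; it raises once the limit is reached.
        if self.used.get(assistant_id, 0) >= self.limit:
            raise TokenBudgetExceeded(
                f"{assistant_id} has used {self.used[assistant_id]} "
                f"of {self.limit} tokens"
            )


tracker = UsageTracker(limit_per_assistant=100)
tracker.record("asst_example", 60)   # first run consumed 60 tokens
tracker.check("asst_example")        # still under the limit, no error
tracker.record("asst_example", 50)   # second run pushes the total to 110
# tracker.check("asst_example") would now raise TokenBudgetExceeded
```

With usage reported per run, the `record` call could be fed directly from the API response instead of being estimated on the client.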
Are you currently working on this feature?
Thank you.