The title summarizes the feedback: currently, there is no way to track the number of tokens you are using in the Assistants API the way you can in the Completions API. For many developers, token tracking is the single most important tool for keeping costs in check, and with the added ambiguity of retrieval, it is crucial.
I know we could probably implement a tiktoken solution, but that adds complexity and may not be 100% accurate, especially if OpenAI runs background system prompts that we are billed for.
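For what it's worth, here is a minimal sketch of the tiktoken workaround. The helper name `estimate_tokens` is mine, and the characters/4 fallback is a rough heuristic for when tiktoken is not installed; as noted above, neither approach can account for hidden system prompts we may be billed for.

```python
def estimate_tokens(text: str, model: str = "gpt-4") -> int:
    """Return an approximate token count for `text`.

    Uses tiktoken's encoder for the given model when available,
    otherwise falls back to a rough ~4-characters-per-token estimate.
    """
    try:
        import tiktoken  # third-party: pip install tiktoken
        enc = tiktoken.encoding_for_model(model)
        return len(enc.encode(text))
    except ImportError:
        # Very rough fallback for English text; do not rely on it for billing.
        return max(1, len(text) // 4)

print(estimate_tokens("Hello, Assistants API!"))
```

Even when tiktoken is available, this only counts tokens you can see, which is exactly why server-side usage reporting matters.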
Absolutely phenomenal work at DevDay, and I am looking forward to the improvements in the coming weeks.
@logankilpatrick — I hate tagging you directly but this is a fairly pressing issue for most teams right now. Appreciate the help!
Update: OpenAI has added prompt + completion tokens in the Assistants API, thank you team!
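With usage now reported per run, aggregating cost is straightforward. A small sketch, assuming the usage objects carry `prompt_tokens`, `completion_tokens`, and `total_tokens` fields as on a completed run; the dicts below are illustrative sample data standing in for values you would read from retrieved runs (e.g. via `client.beta.threads.runs.retrieve`), not live API responses.

```python
def total_usage(usages):
    """Sum prompt/completion token counts across a list of usage dicts."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for usage in usages:
        for key in totals:
            # Missing fields count as zero so partial usage objects don't crash.
            totals[key] += usage.get(key, 0)
    return totals

# Illustrative sample data shaped like per-run usage objects.
sample_runs = [
    {"prompt_tokens": 1200, "completion_tokens": 350, "total_tokens": 1550},
    {"prompt_tokens": 800, "completion_tokens": 100, "total_tokens": 900},
]
print(total_usage(sample_runs))
# {'prompt_tokens': 2000, 'completion_tokens': 450, 'total_tokens': 2450}
```

Multiplying the totals by your model's per-token rates then gives a running cost estimate per thread.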
I agree. The current situation is nuts. I've just gotten a rate-limit error:
“rate_limit_exceeded: You exceeded your current quota, please check your plan and billing details.”
Ok, which limit? No details?
Check my billing page - only 1/4 used for the month.
Check the limits for the model in question. No help there, as the Assistants API calls return nothing to tell me how many tokens are used.
I don’t understand. The Assistants API is just a wrapper around the ChatGPT models, resending the full conversation for each step. Each time that happens, the model should be returning the number of tokens used, just like the Completions API does. Why is it so difficult to pass this information back to us?
And if you look at the fact that OpenAI is also dragging its feet on implementing streaming, it makes you wonder if they have any interest in serving developers at all, now that their GPT store is taking off.