While investigating the missing cached inputs on the Usage dashboard, we started comparing our saved token values against what the dashboard shows (which is nothing):
- Polling the Usage API for the completions object for a single day (https://api.openai.com/v1/organization/usage/completions), we see zero for `input_cached_tokens`, yet the completion objects returned for that day show cached tokens (saved when streamed or returned from https://api.openai.com/v1/chat/completions). Is this expected behavior?
- Are cached tokens from the Assistants API eligible for the 50% cached-input discount, and where can we see that using the Usage API? We’re assuming it’s a different request than the completions one above, as we have hundreds of thousands of cached tokens a day that are not showing up in the `usage/completions` object. They do show up and are recorded at the time of the `run.completion`, however.
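For anyone wanting to reproduce the comparison, here is a minimal sketch of how the two numbers can be lined up. It makes no network calls and assumes the JSON has already been fetched and saved. The payload shapes are assumptions on our side: that the Usage API page nests `input_cached_tokens` under `data[].results[]`, and that each saved Chat Completions `usage` object carries `prompt_tokens_details.cached_tokens`. The function names are hypothetical helpers, not part of any SDK.

```python
from typing import Any, Iterable


def usage_api_cached_tokens(usage_page: dict[str, Any]) -> int:
    """Sum input_cached_tokens across all buckets of one Usage API page.

    Assumed shape: {"data": [{"results": [{"input_cached_tokens": int, ...}]}]}
    """
    total = 0
    for bucket in usage_page.get("data", []):
        for result in bucket.get("results", []):
            # Treat a missing or null field as zero rather than failing.
            total += result.get("input_cached_tokens") or 0
    return total


def saved_cached_tokens(saved_usages: Iterable[dict[str, Any]]) -> int:
    """Sum cached tokens from locally saved completion `usage` objects.

    Assumed shape: {"prompt_tokens_details": {"cached_tokens": int}, ...}
    """
    total = 0
    for usage in saved_usages:
        details = usage.get("prompt_tokens_details") or {}
        total += details.get("cached_tokens") or 0
    return total


def cached_token_gap(usage_page: dict[str, Any],
                     saved_usages: Iterable[dict[str, Any]]) -> int:
    """Positive result: we recorded more cached tokens than the Usage API reports."""
    return saved_cached_tokens(saved_usages) - usage_api_cached_tokens(usage_page)
```

In our case the gap equals the full saved total, since the Usage API side sums to zero for the day. If the Usage API response is paginated, each page would need to be summed before comparing.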
To wrap up:
- Our Chat Completions have some cached tokens per day, yet the Usage API is reporting zero.
- Our Assistants Runs have hundreds of thousands of cached tokens per day, yet we’re seeing nothing like that reported anywhere.
Clarity and attention on this would be appreciated, as we planned for the coming months based on the initial rollout and the caching results we saw at release, and that seems to have changed drastically.