This sounds like a “solve for x” problem.
The difference in cached tokens between the two runs is 2560, which fits the prompt-caching size rule of 1024 + n*128 token increments (2560 = 1024 + 12*128). That difference likely comes from the very first API call not being served from cache, while the later ones hit the cache for the shared context. Looking at the other figures, the only consistent solution is that the common instructions plus schema input total 2560 tokens (anywhere up to 2687, since the reported cached count falls on the lower 128-token increment).
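As a quick check of that arithmetic (the function name here is just illustrative), you can verify that 2560 matches the 1024 + n*128 pattern while a nearby value does not:

```python
# Check whether a token count matches the prompt-cache sizing rule:
# caching starts at 1024 tokens, then grows in 128-token increments.
def fits_cache_increment(tokens: int, base: int = 1024, step: int = 128) -> bool:
    return tokens >= base and (tokens - base) % step == 0

diff = 2560
print(fits_cache_increment(diff))          # True: 2560 = 1024 + 12 * 128
print((diff - 1024) // 128)                # 12 increments above the base
print(fits_cache_increment(diff + 1))      # False: 2561 is off-increment
```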
On the first run: 113 cached calls
On the second run: 114 cached calls
So over 10% of calls are either not being reported, or not hitting the cache despite sharing the common prefix.
The larger discrepancy in total input tokens can come from the varying size of the non-common data in each request, and from cache hits not kicking in when overlapping requests are sent before the first one has populated the cache.
The fault in the original post (which you have to infer from the title to know whose dashboard is meant) is trusting the platform's usage page to report your usage correctly, and expecting the 15-minute buckets to be as discrete as you hope. In recent days it has had faults as severe as no billing showing up at all for multiple users over consecutive days. Even when operating normally, usage can still trickle in late. That's the explanation you asked for.
You should log the usage statistics that are returned by the API call itself.
usage_list.append({"index": index, "usage": response.usage})
This can be parsed further, for example to check whether all API calls succeeded or silently failed (your saved task output should also show this). You should then see identical token counts for identical input, and can expect that to match your bill.
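As a minimal sketch of that logging (the stub `_Usage` object and all helper names are illustrative; in real code you would pass `response.usage` from your SDK call, whose fields follow the Chat Completions usage object: `prompt_tokens`, `completion_tokens`, and `prompt_tokens_details.cached_tokens`):

```python
from dataclasses import dataclass, field

usage_list: list[dict] = []

def record_usage(index: int, usage) -> dict:
    """Flatten one response's usage into a plain dict and store it."""
    details = getattr(usage, "prompt_tokens_details", None)
    entry = {
        "index": index,
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "cached_tokens": getattr(details, "cached_tokens", 0) or 0,
    }
    usage_list.append(entry)
    return entry

# Stubs standing in for response.usage, so the sketch runs without an API key.
@dataclass
class _Details:
    cached_tokens: int = 0

@dataclass
class _Usage:
    prompt_tokens: int
    completion_tokens: int
    prompt_tokens_details: _Details = field(default_factory=_Details)

record_usage(0, _Usage(2700, 150, _Details(0)))     # first call: cache miss
record_usage(1, _Usage(2700, 150, _Details(2560)))  # later call: 2560 cached

# Uncached input tokens you'd expect to be billed at the full rate:
total_uncached = sum(e["prompt_tokens"] - e["cached_tokens"] for e in usage_list)
print(total_uncached)  # 2700 + (2700 - 2560) = 2840
```

Summing your own log this way gives a per-call record you can reconcile against the dashboard, instead of hoping the 15-minute buckets line up.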