GET completion/{id} endpoint for usage backfill

We currently use the completions API with streaming, which means the response body does not include token usage. It would be great to call GET completion/{id} or something similar to capture token usage and backfill it into our database. It seems that the OpenAI platform “usage” page’s frontend already makes such a GET call:


Yes, that’s a decent idea, but it also adds latency and network overhead that are completely unnecessary, and it would delay the decision-making for the next round of a chat calculation.

The “call” can be to your own token counter in your software. You know how much you send, and you know how much you receive. You can even count the tokens before you send, in order to make decisions about how much other context you can include or how big a response to allow.
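A minimal sketch of this local counting, assuming the `tiktoken` library is available (with a rough word-count fallback if it is not). The prompt is counted before sending; the completion is counted by accumulating the streamed deltas and encoding the joined text once at the end:

```python
def count_tokens(text: str, model: str = "gpt-3.5-turbo") -> int:
    """Count tokens locally instead of fetching usage from the API."""
    try:
        import tiktoken
        enc = tiktoken.encoding_for_model(model)
        return len(enc.encode(text))
    except Exception:
        # Rough fallback if tiktoken is unavailable:
        # ~1 token per whitespace-separated word is a loose lower bound.
        return len(text.split())


prompt = "Summarize the plot of Hamlet in one sentence."
prompt_tokens = count_tokens(prompt)  # known before the request is sent

# During streaming, append each delta to a list, then count once at the end.
# The chunks below are a stand-in for the streamed deltas.
chunks = ["Hamlet ", "avenges ", "his ", "father."]
completion_tokens = count_tokens("".join(chunks))
```

Both numbers can then be written straight to your database alongside the completion, with no extra round trip to the API.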

The only thing you can’t determine from the response is the price of a failure.