I came across this while checking my token usage for this month via the https://api.openai.com/v1/usage?date= endpoint on the official website. The data I found looks similar to the following:
I am not sure what n_context_tokens_total represents, but it exceeds the 4k-token maximum context limit (reaching an astonishing 152k tokens). As I understand it, calls to OpenAI's API are stateless, and no single call can pass content exceeding 4k tokens. So how was this record generated?
I calculated the tokens and money spent this month and found that the sum of n_context_tokens_total + n_generated_tokens_total across the entries closely matches the tokens corresponding to the actual amount billed (a rough sketch of that check follows below). If the model itself has no memory, why does n_context_tokens_total gradually grow to such an unreasonable level across multiple calls?
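For reference, here is a minimal sketch of the kind of check described above: fetch one day of usage from the endpoint mentioned in the question and sum the two fields. The Authorization header format, the JSON layout (a top-level "data" list of entries), and the example date are assumptions rather than documented guarantees:

```python
# Minimal sketch (assumptions noted above): fetch one day of usage and sum
# context + generated tokens across the returned entries.
import os
import requests

API_KEY = os.environ["OPENAI_API_KEY"]

resp = requests.get(
    "https://api.openai.com/v1/usage",
    params={"date": "2023-04-01"},  # placeholder date
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
resp.raise_for_status()

total = sum(
    entry.get("n_context_tokens_total", 0) + entry.get("n_generated_tokens_total", 0)
    for entry in resp.json().get("data", [])
)
print(f"Tokens billed on this date: {total}")
```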
The endpoint is reporting aggregated (total) usage, not per-request usage. Each call is limited to 4k tokens, but there is no monthly token cap per se, unless you hit your API spending limit.
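To illustrate that aggregation (the per-request numbers below are made up): many requests that each stay under the 4k context limit can still add up to a very large total for the period being reported.

```python
# Illustration only: hypothetical per-request prompt sizes, each under the
# 4k context limit, still aggregate to a much larger total, which is what a
# field like n_context_tokens_total reports.
per_request_context_tokens = [3800, 3500, 4000, 2700] * 10  # 40 hypothetical calls
print(sum(per_request_context_tokens))  # 140000 -- far beyond 4k, yet no single call exceeded the limit
```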
The tokens consumed per request are returned in each response.
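For example, with the (pre-1.0) openai Python client, every chat completion response carries its own usage object; the model name and prompt here are just placeholders:

```python
# Sketch using the legacy openai Python client (pre-1.0 interface); the model
# and prompt are placeholders. Each response includes its own `usage` counts.
import openai

openai.api_key = "sk-..."  # your API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Per-request token counts, e.g.
# {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21}
print(response["usage"])
```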