Are used tokens counted when a request starts or ends?

randomstuff · October 27, 2023, 5:27pm

Hello,

I am trying to implement an efficient strategy for sending my users’ API requests, one that maximizes throughput without hitting OpenAI’s rate limits. Some of my queries take quite some time, and I have multiple people interacting with the service.

I couldn’t find this information anywhere in the docs or on this forum. Does anybody know whether “used tokens” are counted at the start, when I send the request to the API, or at the end, when I receive the response from OpenAI?

sps · October 28, 2023, 2:20am

As I understand it, for requests that produce a completion, the completion tokens can only be counted once they have been generated, so it would make sense to count them at the end. However, it could also be a two-step process, with prompt tokens counted at the start and completion tokens counted at the end.

For embeddings, the tokens can be counted at the start, since the whole input is known up front.
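
Since it isn't confirmed here when the server counts tokens, one conservative option on the client side is to assume the worst case at the start and correct it at the end. Below is a minimal sketch of that idea, not OpenAI's actual accounting: `TokenBudget`, `try_reserve`, and `settle` are hypothetical names, and the 60-second sliding window is an assumption. The real usage passed to `settle` could come from the `usage.total_tokens` field of the API response.

```python
import time
from collections import deque


class TokenBudget:
    """Client-side sliding-window estimate of tokens used per minute.

    Hypothetical helper for illustration; not part of any OpenAI library.
    """

    def __init__(self, tokens_per_minute, window_seconds=60.0):
        self.limit = tokens_per_minute
        self.window = window_seconds  # assumed window length
        self.entries = deque()  # [timestamp, tokens] pairs, oldest first

    def _used(self, now):
        # Drop entries that have aged out of the window, then sum the rest.
        while self.entries and now - self.entries[0][0] >= self.window:
            self.entries.popleft()
        return sum(tokens for _, tokens in self.entries)

    def try_reserve(self, prompt_tokens, max_completion_tokens, now=None):
        """At request start, reserve prompt + maximum completion tokens.

        Returns the reservation entry, or None if it would exceed the limit.
        """
        now = time.monotonic() if now is None else now
        worst_case = prompt_tokens + max_completion_tokens
        if self._used(now) + worst_case > self.limit:
            return None
        entry = [now, worst_case]
        self.entries.append(entry)
        return entry

    def settle(self, entry, actual_total_tokens):
        """At request end, replace the worst-case figure with the real usage."""
        entry[1] = actual_total_tokens
```

A caller would skip or delay the request whenever `try_reserve` returns `None`, and call `settle` once the response arrives. This over-counts while requests are in flight, which trades some throughput for a lower chance of hitting the limit whichever way the server counts.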

randomstuff · October 28, 2023, 12:17pm

Thank you for the answer. I will try counting tokens at the end of each request and hope I don’t keep getting errors.
