Please explain the Tokens per minute metric

The ChatGPT API has a tokens-per-minute limit.

But which tokens are meant by that? The number of prompt tokens sent to ChatGPT, the number of tokens returned by ChatGPT, or both prompt and completion tokens combined?

It’s both, and it’s complicated.

The input tokens are estimated and added to your max_tokens, so you can think of the limit as total token throughput per minute, of sorts. They’re not actually using tiktoken at that level; it’s more of a flood-prevention sort of thing.
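To make that arithmetic concrete, here is a rough sketch of how a request might be counted against the limit. The 4-characters-per-token heuristic and all function names are my own assumptions for illustration; the actual estimator the API uses isn't documented here and may differ.

```python
import math

# Assumption: a rough rule of thumb of ~4 characters per token.
# This is NOT the API's real estimator, just a stand-in for one.
CHARS_PER_TOKEN = 4


def estimate_prompt_tokens(prompt: str) -> int:
    """Cheaply estimate prompt tokens without running a tokenizer."""
    return math.ceil(len(prompt) / CHARS_PER_TOKEN)


def tokens_counted_against_limit(prompt: str, max_tokens: int) -> int:
    """Estimated input tokens plus the requested max_tokens."""
    return estimate_prompt_tokens(prompt) + max_tokens


def requests_per_minute(prompt: str, max_tokens: int, tpm_limit: int) -> int:
    """How many identical requests fit under a tokens-per-minute limit."""
    return tpm_limit // tokens_counted_against_limit(prompt, max_tokens)


if __name__ == "__main__":
    prompt = "x" * 400  # a 400-character prompt, roughly 100 tokens
    print(tokens_counted_against_limit(prompt, max_tokens=500))
    print(requests_per_minute(prompt, max_tokens=500, tpm_limit=90_000))
```

The practical upshot, if the counting works this way: setting max_tokens much higher than you need eats into your per-minute budget even when the actual completion is short.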
