Please explain the Tokens per minute metric

It’s both - and it’s complicated

the input tokens are estimated, and added to your max_tokens - so you can think of it as total token throughput per minute of sorts. they’re not actually using tiktoken at that level, it’s more of a flooding prevention sort of thing.

2 Likes