I don’t think it eats your tokens, because nothing was analyzed or generated.
Some users think rate limiting happens somewhere near the load balancer (see “Client-side rate limiting - #10 by harjot.gill”), and that the calculation doesn’t even use a proper tokenizer.
As for how to deal with this: you’re supposed to use exponential backoff https://platform.openai.com/docs/guides/rate-limits/error-mitigation
Just naively retrying will probably get you banned or throttled by Cloudflare, so I wouldn’t recommend it.
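
In case it helps, here is a rough sketch of what exponential backoff with jitter can look like in plain Python. It is not taken from the linked guide: `RateLimitError` and `retry_with_backoff` are made-up names, and the default delays are just guesses, so swap in whatever exception your client actually raises on a 429.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


def retry_with_backoff(call, max_retries=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Run call(); on RateLimitError, wait and retry with a growing delay."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, let the caller see the error
            # Upper bound on the wait doubles each attempt: 1s, 2s, 4s, ...
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter so parallel clients don't all retry in lockstep.
            sleep(random.uniform(0, cap))
```

You would wrap your request in a function or lambda and pass it in, e.g. `retry_with_backoff(lambda: send_request(payload))`, where `send_request` is whatever you already use. The `sleep` parameter is only there so the waiting can be replaced in tests.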