Please allow generation of restricted client api tokens!
Routing client requests through backend servers defeats the purpose of all the amazing work done on improving GPT-4o latency. It can double the latency in some cases (320ms is advertised for GPT-4o and a backend route can easily add another 300ms).
Minimum features requested:
- API call to generate a client token.
- Set expiry.
- Set rate limit.
- Set allowed endpoints.
If you agree, please add a like!