Inquiry About Maximum Rate Limit for GPT-3.5-turbo-16k Model


We are developing a personalized chatbot platform where hundreds of users can freely access our chatbot. Currently, we are using the GPT-3.5-turbo-16k model, which I understand has a rate limit.

My understanding is that it’s possible to request a rate limit increase through the [Rate-Limit Request Form].

I would like to inquire about the maximum rate limit that OpenAI can potentially allow. This information will be crucial for us to design our chatbot architecture effectively.

Thank you for your assistance.

Hi and welcome to the Developer Forum!

You can find the details of the first 3 tiers in the link below. The top limits are set by negotiation with OpenAI; you could start an enquiry with Contact sales, although it should be pointed out that they are extremely busy at the moment.

Is there a way to have stability with rate limits ahead of time, or is it always going to be capped at some point?

Because I would love to have the option of a plan that can theoretically grow at some sort of predetermined cost for rate limits.
It really helps with planning.

Judging by the current API billing model, it seems that a prepayment, followed by a waiting period to confirm that the payment is genuine, will automatically move your account to the next rate limit tier.

But isn't there still a maximum rate limit? I've seen up to tier 2, which is not a lot if you think of very large businesses.

Because there is a general shortage of compute for AI globally, affecting every service provider currently on the market, resources have to be distributed carefully.

If your usage regularly reaches your current rate limits, your account should be automatically reviewed for increases as compute availability permits.
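Until a limit increase comes through, a client that hits its rate limit usually has to wait and retry. Below is a minimal sketch of exponential backoff with jitter, not an official recipe. `RateLimitError` and `call_with_backoff` are hypothetical names used for illustration; substitute whatever exception your API client actually raises on a rate-limit (HTTP 429) response.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error your API client raises on HTTP 429."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff plus jitter on RateLimitError.

    `sleep` is injectable so the function can be tested without real waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            # Wait 1x, 2x, 4x, ... the base delay, plus random jitter so that
            # many clients do not all retry at the same moment.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

You would wrap each API request in a function and pass it in, e.g. `call_with_backoff(lambda: send_chat_request(prompt))`, where `send_chat_request` is your own (here hypothetical) request function.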

There is also the possibility that, once your usage approaches 450,000,000 tokens per day, you may find a dedicated instance preferable; that can be discussed via Contact sales.
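To put that daily figure in the same units as a per-minute rate limit, here is a rough back-of-the-envelope sketch. The 450,000,000 tokens per day figure is the one quoted above; the user count, request count, and tokens per request in the example are made-up assumptions, so plug in your own numbers.

```python
MINUTES_PER_DAY = 24 * 60

# Daily volume quoted above as the point where a dedicated instance
# may become worth discussing.
DEDICATED_INSTANCE_TOKENS_PER_DAY = 450_000_000


def tokens_per_minute(tokens_per_day):
    """Average tokens per minute if the daily volume were spread evenly."""
    return tokens_per_day / MINUTES_PER_DAY


def estimated_daily_tokens(users, requests_per_user_per_day, tokens_per_request):
    """Rough daily token volume for a chatbot platform."""
    return users * requests_per_user_per_day * tokens_per_request


if __name__ == "__main__":
    print(tokens_per_minute(DEDICATED_INSTANCE_TOKENS_PER_DAY))  # 312500.0

    # Hypothetical workload: 500 users, 40 requests each per day,
    # about 2,000 tokens per request (prompt plus completion).
    daily = estimated_daily_tokens(500, 40, 2_000)
    print(daily)  # 40000000
    print(round(tokens_per_minute(daily)))
```

Real traffic is bursty, so peak per-minute usage will be higher than this even average; treat the result as a lower bound when comparing against a tier's tokens-per-minute limit.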

Oh, so as a policy it's possible to buy a specific server that runs the model on OpenAI’s side? Because that would make me much less concerned.

I am considering how much time and effort I want to put into learning to develop with GPT-4 versus running things locally, and a big part of it is whether big companies can actually use GPT-4 effectively or whether the rate limit is too much of a risk.

So knowing that there is a guaranteed ability to get the scaling you need for very large companies would make me much calmer.

There is certainly a path to a dedicated instance running on Azure hardware that only processes your application. Compute availability issues should ease as hardware providers roll out more GPUs and dedicated AI processors. There will still be work to be done in terms of securing that compute for your application, but I do not expect this to be a significant consideration in 6-12 months.