Documentation on Automatic Model Downgrading for Low-Tier Users during Peak Times?

irthomasthomas · January 18, 2024, 9:28am

Good morning,

I recall reading a note in the OpenAI documentation late last year about API traffic shaping.
The note suggested that during peak times, low-tier users may receive smaller models. However, I cannot find this note in the current documentation. I’d be most grateful if anyone could point me to its new location or confirm it has been removed.

Thank you.

EricGT · January 18, 2024, 10:33am

Good question.

I did a quick search and did not find anything.

While your question is specifically about the API, it seems that the API’s answer to traffic shaping is rate limits.

However, ChatGPT does seem to downgrade from GPT-4 to GPT-3.5 or some other model. I could not find official documentation on that but other users have noted this in recent months.

irthomasthomas · January 19, 2024, 2:52pm

Thanks, I did check that, as I assumed that’s where it was originally. Perhaps they have stopped the practice now, but it would be nice to get some clarity. Recently, my API requests to chatgpt-3.5-1106 have returned an error saying that the max length is 4k tokens, which I took to be traffic shaping bumping me to a smaller model. If it’s not intentional traffic shaping, then it’s a bug. If it is traffic shaping, then I’d like to know at what tier it stops.
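
One rough way to test the traffic-shaping hypothesis: chat completion responses include a `model` field naming the model that produced the reply, so it can be compared with the model that was requested. The sketch below is only illustrative. The helper name `served_model_matches` and the sample payloads are made up, and whether the API would actually report a silently downgraded model in that field is an assumption, not something the documentation confirms.

```python
from typing import Any, Mapping


def served_model_matches(requested: str, response: Mapping[str, Any]) -> bool:
    """Return True if the response's `model` field is consistent with the request.

    `response` is assumed to be the parsed JSON body of a chat completion.
    """
    served = str(response.get("model", ""))
    # Heuristic: an alias may be reported as a dated snapshot
    # (e.g. "gpt-4" -> "gpt-4-0613"), so a "<requested>-" prefix also counts.
    return served == requested or served.startswith(requested + "-")


# Hypothetical payloads, shaped like a chat completion response body.
same = {"model": "gpt-3.5-turbo-1106", "choices": []}
other = {"model": "gpt-3.5-turbo-0613", "choices": []}

print(served_model_matches("gpt-3.5-turbo-1106", same))
print(served_model_matches("gpt-3.5-turbo-1106", other))
```

Logging this check alongside any max-length errors would at least show whether the two ever coincide. The prefix rule is deliberately loose, so it may need tightening for model families that share a prefix.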
