I am working on a chatbot for which I need a large context window. Since gpt-4(o) has a context window of 128,000 tokens, I decided to use it.
Now I have run into the problem that I often get error messages like this:
RateLimitError: Error code: 429 - {'error': {'message': 'Request too large for gpt-4o in organization org-3cH9ytW9RJ5R0ZJMUH7cfDSj on tokens per min (TPM): Limit 30000, Requested 43385. The input or output tokens must be reduced in order to run successfully.
It took me a while to figure out what the problem is. Apparently, my organization is subject to a rate limit of 30,000 tokens per minute (TPM) for the gpt-4o model, and this TPM limit is entirely separate from the model's context length of 128,000 tokens.
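For context, this is roughly how the error surfaces in my code. A minimal sketch using the official openai Python package; `long_history` is just a placeholder name for my chatbot's accumulated conversation:

```python
from openai import OpenAI, RateLimitError

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder for my chatbot's accumulated conversation history;
# in practice it easily grows past 40,000 tokens.
long_history = [{"role": "user", "content": "..."}]

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=long_history,
    )
except RateLimitError as e:
    # HTTP 429: the single request already exceeds the per-minute
    # token budget, even though it would fit comfortably into the
    # 128,000-token context window.
    print(e)
```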
And here is my question: even though the TPM limit and the context length are two different things, doesn't the TPM limit effectively cap the usable context at 30,000 tokens when using gpt-4(o) via the API?
My reasoning is that I can never put more than 30,000 tokens into the model's context window (in fact fewer, since the tokens generated in the answer presumably also count toward the TPM limit), because any larger single request already exceeds the TPM limit and gets rejected.
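To make the arithmetic concrete, here is a small sketch of how I now estimate whether a request can fit under the limit at all before sending it (assuming a recent tiktoken version that ships the o200k_base encoding used by gpt-4o; `TPM_LIMIT`, `OUTPUT_BUDGET`, and `fits_under_tpm` are my own names):

```python
import tiktoken

TPM_LIMIT = 30_000     # my organization's TPM limit for gpt-4o
OUTPUT_BUDGET = 1_000  # hypothetical allowance for the model's answer

enc = tiktoken.get_encoding("o200k_base")  # the encoding gpt-4o uses

def fits_under_tpm(messages: list[dict]) -> bool:
    # Rough estimate: sum the content tokens of all messages.
    # The true count is slightly higher due to per-message overhead.
    prompt_tokens = sum(len(enc.encode(m["content"])) for m in messages)
    return prompt_tokens + OUTPUT_BUDGET <= TPM_LIMIT
```

With the 43,385 tokens from the error above, this check fails straight away: no matter how I split the budget between prompt and answer, the sum can never stay below 30,000.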
If my thinking here is correct, isn't it somewhat misleading of OpenAI to advertise a context length of 128,000 tokens for gpt-4(o) when the TPM limit makes it impossible to actually use that much of it?
Thank you for your insights, and best wishes from Germany.