Why am I hitting a 300,000-token limit on GPT-4.1, which should have a 1M context length?

Hello everyone

  • I am sending requests to the new GPT-4.1 model via the LangSmith playground (see the attached picture).
  • There are around 330,000 tokens in the message.
  • From this I get an error with the following message:

openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 300000 tokens. However, your messages resulted in 330294 tokens (including 57 in the response_format schemas.). Please reduce the length of the messages or schemas.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}

According to the documentation there should be a 1M-token context window, am I right?
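In case you want to verify the token count on your side before sending, here is a minimal sketch using tiktoken (assuming the o200k_base encoding used by the GPT-4o/4.1 family; the server-side count can differ slightly because of message framing and schema tokens):

import tiktoken

# Assumption: GPT-4.1 models use the o200k_base encoding; the API's own
# count may be a bit higher due to message framing and response_format schemas.
enc = tiktoken.get_encoding("o200k_base")

def count_tokens(text: str) -> int:
    """Return the approximate token count for a single message string."""
    return len(enc.encode(text))

print(count_tokens("your very long message here"))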

And yes, I am 100% sure I am using GPT-4.1 :smiley: :

  • invocation_params
      • _type: "openai-chat"
      • model: "gpt-4.1-mini"
      • model_name: "gpt-4.1-mini"
      • response_format: "<class 'backend.edmund.tools.eplan_tool.pydantic_models.RerankingFormat'>"
      • stop: null
      • stream: false
      • temperature: 0.1
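For reference, invocation params like the ones above typically come from a LangChain ChatOpenAI call along these lines (a sketch only; RerankingFormat's fields are hypothetical stand-ins, since only its import path appears in the params):

from langchain_openai import ChatOpenAI
from pydantic import BaseModel

# Hypothetical stand-in for the real RerankingFormat model; its actual
# fields are not shown in this thread.
class RerankingFormat(BaseModel):
    ranked_ids: list[int]

llm = ChatOpenAI(model="gpt-4.1-mini", temperature=0.1)
# Passing a Pydantic class as the structured-output schema is what produces
# the response_format entry seen in the invocation params above.
structured_llm = llm.with_structured_output(RerankingFormat)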


Hi @jakub.szlaur

I was unable to reproduce this in my tests.

CompletionUsage(completion_tokens=23, prompt_tokens=343222, total_tokens=343245, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0))
Here's the code I used:

import openai
from pydantic import BaseModel

# longst is a local module exposing a single very long test string
from longst import longst

client = openai.OpenAI()


class Typos(BaseModel):
    pos: str
    word: str
    correction: str


# Triple the long string to push the prompt well past 300k tokens
text = longst * 3

r = client.beta.chat.completions.parse(
    model="gpt-4.1-mini",
    messages=[{"role": "user", "content": text}],
    response_format=Typos,
)

print(r.usage)

longst is just a very long string, i.e. around 141k tokens long.
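If you don't have a comparable string handy, a quick stand-in (assumed filler text; the exact token count will vary with the tokenizer):

# Assumption: roughly 10 o200k_base tokens per repetition, so ~140k tokens total
longst = "The quick brown fox jumps over the lazy dog. " * 14000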

Can you check whether the issue occurs when you run this code?
