Hitting max output token limit for 4.1-mini

I am trying to use the 4.1-mini model to generate some fairly lengthy JSON. When I do this, I get a response with status "incomplete" and the reason given as "max_output_tokens", whose limit is 32,768 for this particular model.
I then tried to set the max_output_tokens like so:

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1-mini",
    temperature=0,
    top_p=1,
    max_output_tokens=100000,
    input=[
        {"role": "system", "content": system_prompt.strip()},
        {"role": "user", "content": questions},
    ],
)

This new limit is more than enough for the JSON I need. But when I run this, I hit the same problem: the response still comes back incomplete at 32,768 tokens, with the reason "max_output_tokens".
Why is my custom limit not being applied, and what can I do to solve this?
Thanks!

The model has a hard limit at which OpenAI has capped its generation capability.

This is likely done both because of the escalating cost of long generations and because output quality tends to degrade in very long responses the model was not trained for, usually when the AI "goes off the rails" (which may be the case here).

You cannot ask for more with an API parameter.

Welcome to the community @havock2926.

Every OpenAI model has a maximum output token limit, and users cannot override it through the API. These limits are typically a function of the model's architecture and training.

If you would like to generate 100,000 output tokens, you could look at the o4-mini or o3 models, which have a maximum output token limit of 100,000. If you want to keep using the 4.1-mini model, you could break your task up creatively into subtasks whose outputs can easily be stitched together.
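The subtask approach can be sketched like this. The `chunk`/`stitch` helpers and the batch size are illustrative, not part of the SDK; the model replies are faked here so the stitching logic runs standalone, but in practice each batch would go out as its own `responses.create()` call:

```python
import json

def chunk(items, size):
    # Split the question list into batches small enough that each
    # batch's JSON output stays under the model's output-token ceiling.
    return [items[i:i + size] for i in range(0, len(items), size)]

def stitch(json_parts):
    # Merge the per-batch JSON arrays back into a single list.
    merged = []
    for part in json_parts:
        merged.extend(json.loads(part))
    return merged

batches = chunk(["q1", "q2", "q3", "q4", "q5"], 2)
# Stand-in for one responses.create() call per batch:
fake_replies = [json.dumps([f"answer to {q}" for q in batch]) for batch in batches]
print(stitch(fake_replies))
```

Prompting each batch with the same system instructions keeps the per-batch JSON fragments consistent enough to merge.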

You can easily look up and compare the limits for each model on this official page. Hope this helps.
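As a quick sanity check before committing to a model, the ceilings mentioned in this thread can be put in a small table (values as of this writing, so verify them against the models page; `pick_model` is just an illustrative helper):

```python
# Output-token ceilings discussed above; these can change over time.
MAX_OUTPUT_TOKENS = {
    "gpt-4.1-mini": 32_768,
    "o4-mini": 100_000,
    "o3": 100_000,
}

def pick_model(needed_output_tokens):
    # Return the models whose ceiling can fit the whole response.
    return [m for m, cap in MAX_OUTPUT_TOKENS.items() if cap >= needed_output_tokens]

print(pick_model(100_000))  # ['o4-mini', 'o3']
```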
