Default value of `max_output_tokens` of `responses.create` for `gpt-4.1-2025-04-14`

Hello,

I would like to know the default value of `max_output_tokens` in `responses.create` for `gpt-4.1-2025-04-14`.

If we don’t set this value, does it default to the model’s maximum output capacity?

Could anyone help?

Thank you

Timur

PS: https://platform.openai.com/docs/api-reference/responses/create

I believe it’s 2048. You can test this by instructing the model to repeat a sequence indefinitely, then checking the token consumption (or running the output through the tokenizer).

I don’t see how to test this reliably. Does the model always tend to produce a response close to `max_output_tokens`?

In general, it does tend to land close to `max_output_tokens` when specified (sometimes above, sometimes below). As for the default value, I don’t think it’s mentioned anywhere in the docs or visible in the source code.

To build upon @OnceAndTwice’s idea: find any piece of text that is close to 32K tokens, send it as part of the input via the API, and prompt the model to repeat it verbatim. For reference, the maximum output length for GPT-4.1 is 32,768 tokens. If the model returns the entire text, one can assume that the default `max_output_tokens` is the model’s maximum capacity.
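A rough sketch of that probe, for anyone who wants to try it. Note the assumptions: the ~4-characters-per-token ratio is only a heuristic (use `tiktoken` for an exact count), and `build_long_text` is a hypothetical helper, not anything from the SDK. The guarded block at the end shows how you might call `responses.create` without setting `max_output_tokens` and then inspect `usage.output_tokens` to see where the default cap kicks in:

```python
import os

# Rough heuristic: ~4 characters per token for English text. This is an
# assumption for sizing the input only, not an exact tokenizer count.
CHARS_PER_TOKEN = 4
TARGET_TOKENS = 32_000  # just under GPT-4.1's documented 32,768-token output cap

def build_long_text(target_tokens: int) -> str:
    """Build a filler passage roughly `target_tokens` tokens long (hypothetical helper)."""
    sentence = "The quick brown fox jumps over the lazy dog. "
    repeats = (target_tokens * CHARS_PER_TOKEN) // len(sentence) + 1
    return (sentence * repeats)[: target_tokens * CHARS_PER_TOKEN]

long_text = build_long_text(TARGET_TOKENS)
print(f"Built {len(long_text)} characters (~{len(long_text) // CHARS_PER_TOKEN} tokens)")

# The actual probe needs an API key, so it is guarded here; this is a sketch,
# not a definitive measurement script.
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI  # pip install openai

    client = OpenAI()
    response = client.responses.create(
        model="gpt-4.1-2025-04-14",
        input=f"Repeat the following text verbatim:\n\n{long_text}",
        # max_output_tokens deliberately omitted to observe the default behavior
    )
    # If output_tokens lands near 32,768, the default is plausibly the model's
    # maximum; a much earlier cutoff would reveal a smaller default cap.
    print(response.usage.output_tokens)
```

If the response is truncated well short of the full text, checking `response.incomplete_details` should also indicate whether the cutoff was due to the output-token limit.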