With the same question and the same settings, I requested a JSON response. GPT-4.1-nano and GPT-5-mini work fine, but GPT-5-nano returns null with no error message. The question is only a few hundred characters long, quite short.
I tested on ChatClient and ResponseClient and got the same result.
Yeah, that's exactly why gpt-5-nano behaves like that. When I increase the max output tokens, it ends up using all of them. For example, if the limit's 250, the reasoning tokens hover around ~300. But when I bump it to 1000, the reasoning tokens shoot past 1000, which results in an empty response.
I set `reasoning: {effort: "minimal"}` and the issue was resolved. Alternatively, downgrade to gpt-4.1-nano.
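For anyone landing here, this is roughly what the workaround looks like with the OpenAI Python SDK's Responses API. The model name and token limit mirror the thread; the prompt and the 1000-token cap are placeholders, and the actual network call is left commented out:

```python
# Sketch of the workaround: cap hidden reasoning with reasoning.effort="minimal"
# so it does not consume the whole max_output_tokens budget.
def build_request(prompt: str) -> dict:
    """Build the kwargs for client.responses.create(**...)."""
    return {
        "model": "gpt-5-nano",
        "input": prompt,
        "max_output_tokens": 1000,          # placeholder budget from the thread
        "reasoning": {"effort": "minimal"}, # keep reasoning tokens small
    }

request = build_request("Answer in JSON: ...")
# client = openai.OpenAI(); resp = client.responses.create(**request)
print(request["reasoning"])  # {'effort': 'minimal'}
```

The same `reasoning` object is accepted wherever the Responses API is exposed; if you are on Chat Completions instead, the equivalent knob is `reasoning_effort="minimal"`.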
Understand: the AI model doesn't know what you set max_output_tokens to.
This setting simply cuts off generation once the limit is reached, regardless of whether the model is still in its unseen internal reasoning or has already transitioned to producing output.
In fact, there is a major flaw in the implementation: in non-streaming mode, if you set max_completion_tokens on Chat Completions lower than the entire output needed, you get nothing back, not even a partial output. It is all just billed as "reasoning", even though the model would have been producing the final output at the cutoff point.
Set the limit much higher, as gpt-5-nano produces far more internal reasoning text (which makes it rather dubious on a cost basis).
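You can confirm this failure mode from the response itself rather than guessing. A hedged sketch, assuming the public Responses API schema (`status`, `incomplete_details.reason`, `usage.output_tokens_details.reasoning_tokens`); the sample dict below is invented for illustration, not a real API payload:

```python
# Detect whether a response was cut off while still inside hidden reasoning,
# i.e. every billed output token was a reasoning token and no text came back.
def reasoning_ate_budget(resp: dict) -> bool:
    hit_limit = (
        resp.get("status") == "incomplete"
        and resp.get("incomplete_details", {}).get("reason") == "max_output_tokens"
    )
    usage = resp.get("usage", {})
    reasoning = usage.get("output_tokens_details", {}).get("reasoning_tokens", 0)
    return hit_limit and reasoning >= usage.get("output_tokens", 0)

# Invented example mirroring the thread: a 250-token cap fully spent on reasoning.
sample = {
    "status": "incomplete",
    "incomplete_details": {"reason": "max_output_tokens"},
    "usage": {"output_tokens": 250,
              "output_tokens_details": {"reasoning_tokens": 250}},
}
print(reasoning_ate_budget(sample))  # True
```

If this returns True, raising max_output_tokens or lowering reasoning effort (as above) are the two levers available.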