Silent failure when hitting max_output_tokens limit in Responses API

jgarbers1 · August 20, 2025, 10:34pm

We’ve been porting our Python-based chat application to the Responses API using the official openai wrapper package. Things went smoothly until we started testing with a prompt that requires complex reasoning. We made a streaming responses.create() call to kick off the reasoning-based response, and we never saw any response.output_text events. We did see other events, including reasoning-related ones.

After noticing that the number of output tokens in the stalled response was always exactly 2048, we realized that we were hitting a default output token limit without ever seeing any output, because most or all of the output tokens were being consumed by the reasoning process. When we increased max_output_tokens in the create() call (by a lot!), the process completed successfully.
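For anyone trying to confirm the same diagnosis, here is a minimal sketch of how the reasoning share of the output budget could be read from a finished response. It assumes the response object exposes usage.output_tokens and usage.output_tokens_details.reasoning_tokens; the helper name reasoning_share and the stand-in object at the bottom are hypothetical, used only for illustration.

```python
from types import SimpleNamespace


def reasoning_share(response):
    """Return (reasoning_tokens, output_tokens) for a finished response.

    Assumption: the object exposes usage.output_tokens and
    usage.output_tokens_details.reasoning_tokens. Missing fields count as 0.
    """
    usage = getattr(response, "usage", None)
    output_tokens = getattr(usage, "output_tokens", 0) or 0
    details = getattr(usage, "output_tokens_details", None)
    reasoning_tokens = getattr(details, "reasoning_tokens", 0) or 0
    return reasoning_tokens, output_tokens


# Hypothetical stand-in for a response that spent its whole budget on reasoning.
stalled = SimpleNamespace(
    usage=SimpleNamespace(
        output_tokens=2048,
        output_tokens_details=SimpleNamespace(reasoning_tokens=2048),
    )
)
reasoning, total = reasoning_share(stalled)
print(f"{reasoning} of {total} output tokens went to reasoning")
```

If the two numbers are equal or nearly equal, no budget was left for visible text, which matches the behavior described above.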

My continued concern is that the response failed silently, as far as we can tell, when it hit the output token limit. Sometimes we’d get a response.completed event as if the response had completed successfully; at other times the connection would simply be dropped.

Unless there’s something we should be doing that we aren’t, may I suggest that some error event be generated when the output token limit is hit instead of just “going dark” or dropping the connection?
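Until something like that exists, one possible client-side workaround is to treat a stream that ends without usable text as an error. The sketch below assumes the stream yields events with a type attribute, that text arrives in response.output_text.delta events with a delta attribute, and that truncation is reported through a response.incomplete event whose response carries incomplete_details.reason. TruncatedResponse and collect_text are hypothetical names, and this has not been verified against every SDK version.

```python
from types import SimpleNamespace


class TruncatedResponse(RuntimeError):
    """Raised when a stream ends without a usable, complete answer."""


def collect_text(events):
    """Concatenate text deltas and fail loudly if the response stopped early."""
    parts = []
    finished = False
    for event in events:
        if event.type == "response.output_text.delta":
            parts.append(event.delta)
        elif event.type in ("response.incomplete", "response.failed"):
            # Assumed shape: event.response.incomplete_details.reason
            details = getattr(event.response, "incomplete_details", None)
            reason = getattr(details, "reason", None) or "unknown"
            raise TruncatedResponse(f"{event.type}: {reason}")
        elif event.type == "response.completed":
            finished = True
    if not finished:
        # No terminal event at all, e.g. the connection was dropped.
        raise TruncatedResponse("stream ended without a terminal event")
    text = "".join(parts)
    if not text:
        # A response.completed event arrived, but no visible text came with it.
        raise TruncatedResponse("response completed with no output text")
    return text


# Hypothetical events standing in for a real stream.
demo = [
    SimpleNamespace(type="response.output_text.delta", delta="Hello"),
    SimpleNamespace(type="response.output_text.delta", delta=", world"),
    SimpleNamespace(type="response.completed", response=None),
]
print(collect_text(demo))
```

With the official client, the iterable returned by a streaming responses.create() call could be passed in place of demo. The three failure paths correspond to the cases reported here: an explicit incomplete status, a dropped connection, and a "completed" response that contains no text.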

Thanks in advance for any guidance or feedback you might offer!

aprendendo.next · August 20, 2025, 10:47pm

I’ve just described this issue in another topic.

Basically, if you are using prompt IDs, there might be a ‘legacy’ value that is interfering with your requests.

PS: It seems to be unrelated to legacy prompts after all. I tested a new prompt and confirmed that it was affected too.
