I am using OpenAI's newly launched Responses API with the GPT-4o model, and my requirement is to restrict the number of outputs per API call. Even when I instruct it to do so via the system/developer message, it sometimes returns more than one output object.
Is there any parameter to control this and force OpenAI to generate only one output per response?
There is no specific API control that covers what you are likely experiencing.
That said, what you describe is hard to pin down, since "number of outputs" does not refer to anything clear-cut about API model behavior.
The only thing I can think you might be referring to is a previously seen issue where a structured-output JSON is not immediately followed by a token ending the response; instead, the model sometimes continues writing a second JSON object.
Otherwise, it could simply be a matter of prompting technique versus a model that does not follow along.
You can start by reducing top_p from its default of 1.00 to 0.10 and see if restricting sampling to the most probable tokens gets you more of the expected response style.
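As a minimal sketch, setting top_p on a Responses API call with the Python SDK looks like this (the instructions and input strings here are placeholders, not your original prompt):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Lower top_p so sampling is restricted to the highest-probability tokens,
# which can reduce off-script continuations after the intended JSON.
response = client.responses.create(
    model="gpt-4o",
    instructions="Respond with exactly one JSON object and nothing else.",
    input="Give me one model name as a JSON object.",
    top_p=0.1,
)

print(response.output_text)  # SDK convenience accessor for the text output
```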
I can simulate this, though: below is a particular model and example, using a deliberately poor prompt pattern, showing that a new model does not predict ending the response after a JSON object with high enough certainty. That cannot be tuned away, because there is no logit_bias parameter.
Conditions
gpt-4.1-nano; temperature/top_p: default.
(Note: the Prompts Playground damages output presentation by reformatting text, and it undesirably reverts to "store": true.)
System message
assistant task: randomly output only one "model" from this list of items, never duplicating a past answer. If the possibilities are exhausted, you output "error".
[gpt-4o, gpt-4o-2024-05-13, gpt-4o-2024-08-06, gpt-4o-2024-11-20, gpt-4o-audio-preview, gpt-4o-audio-preview-2024-10-01, gpt-4o-audio-preview-2024-12-17, gpt-4o-mini, gpt-4o-mini-2024-07-18, gpt-4o-mini-audio-preview, gpt-4o-mini-audio-preview-2024-12-17, gpt-4o-mini-realtime-preview, gpt-4o-mini-realtime-preview-2024-12-17, gpt-4o-mini-search-preview, gpt-4o-mini-search-preview-2025-03-11, gpt-4o-mini-transcribe, gpt-4o-mini-tts, gpt-4o-realtime-preview, gpt-4o-realtime-preview-2024-10-01, gpt-4o-realtime-preview-2024-12-17, gpt-4o-search-preview, gpt-4o-search-preview-2025-03-11, gpt-4o-transcribe]
Response: a single line of a JSONL. No whitespace nor linefeed allowed. "model": {"type": "string"}
Example response:
`{"model": "gpt-5-extreme"}`
User input: none; just keep whacking “send” in a conversational context.
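Outside the Playground, a minimal sketch of the same test against the API might look like the following. The previous_response_id chaining and the stand-in user turn are my assumptions (the original test used the Playground with no user input), and chaining requires stored responses:

```python
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    'assistant task: randomly output only one "model" from this list of items, '
    'never duplicating a past answer. If the possibilities are exhausted, you '
    'output "error".\n'
    "[gpt-4o, gpt-4o-mini, ...]\n"  # abbreviated; use the full list above
    'Response: a single line of a JSONL. No whitespace nor linefeed allowed. '
    '"model": {"type": "string"}'
)

# Repeatedly "whack send": chain turns with previous_response_id so the model
# sees its past answers. Note this relies on store=True (the default).
previous_id = None
for _ in range(5):
    resp = client.responses.create(
        model="gpt-4.1-nano",
        instructions=SYSTEM,
        input="(send)",  # stand-in; the original test had no user input
        previous_response_id=previous_id,
    )
    previous_id = resp.id
    print(resp.output_text)
```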
If you use a text.format with a response json_schema including "strict": true, you should be able to more reliably produce one object per response; a second JSON in the same response would then be a grammar-enforcement failure.
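A hedged sketch of that structured-output setup on the Responses API (the schema name "model_choice" is mine; the schema itself mirrors the spec in the system message above):

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1-nano",
    instructions='Randomly output one "model" name as a JSON object.',
    input="(send)",  # stand-in user turn, as above
    text={
        "format": {
            "type": "json_schema",
            "name": "model_choice",  # hypothetical schema name
            "strict": True,  # grammar-enforced: one object, no trailing text
            "schema": {
                "type": "object",
                "properties": {"model": {"type": "string"}},
                "required": ["model"],
                "additionalProperties": False,  # required by strict mode
            },
        }
    },
)

print(response.output_text)  # e.g. {"model":"gpt-4o-mini"}
```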