In non-stream mode, when choices have multiple returns

in the chat-completion api, response object, Why is choices an array? When will multiple elements be returned?

1 Like

The choices array in an OpenAI chat completion response holds multiple completions for a single input - when you request that over the API. This allows for exploring diverse outputs, improving quality by selecting the best option, having another round of AI judge which is best, or providing a fallback in case of quality issues in one generation.

The n parameter in the API request controls the number of completions to generate. By default, n is set to 1, resulting in a single completion and a choices array with a single element. However, if you increase the value of n, the API will generate multiple completions, each represented as an object within the choices array.

In conjunction, it is important to ensure diversity by not using a low temperature or top_p - or you will pay multiple times for similar answers.

You pay once for the input and then pay for every output in tokens.

n:

  • Type: Integer or null
  • Optional: Yes
  • Default: 1
  • Description:
    • The number of chat completion choices to generate for each input message.

Note: In the legacy “completions” API with completions AI models, the best_of parameter (which operates on multiple completions and returns the best) offers similar functionality used with n. You get more results than just the top. However, best_of is not available in the “chat” endpoint.

I hope this is helpful! Let me know if you have any other questions.

2 Likes