How does `n` parameter work in chat completions?

_j · July 6, 2023, 2:59am

You only “send” once, the only change to the API call is the value of the N number, and you are only charged for the extra tokens of the extra output cases.

While technical details aren’t made available, the most logical way is to load the input into the context of the model, and simply capture, wipe, and repeat the output generation that occurs at the generation start point in the context length area where the response is formed.

Topic		Replies	Views
Questions on setting n and max_token API	4	944	March 20, 2024
Clarification on token pricing for multiple completions (n>1) in a single API call" API pricing	1	439	July 3, 2024
Do I need to increase `max_tokens` when using `n>1` e.g. `n=3` for generating multiple chat completions API	8	2073	July 2, 2023
Is the max_tokens parameter of the completions endpoint applicable for ALL or EACH response? API	7	2294	July 3, 2023
Multiple prompt responses everywhere API	6	3663	December 25, 2023

How does `n` parameter work in chat completions?

Related topics