Questions on setting n and max_token

  1. What is the difference between setting the parameter n and sending the same requests n times? e.g., if I set n=5, and get 5 choices, how does this differ from I send the request 5 times?

  2. Will max_token impact the length of the output, besides cutting it when it exceeds the limit? e.g., if I set max_token to a small number versus a large number, will the response I get differ in the length significantly?

Thanks!

Welcome to the community!

I think if you use the n parameter you only pay for the input tokens once. if you make 5 calls, you pay 5 times for both output and input. But I could be wrong, documentation on that is becoming spotty. The utility of that is pretty limited.

Nope. It will just cut it off. It has absolutely no bearing on the quality of the generation.

3 Likes

Thank you! I get the cost part of setting n. But will it impact the quality / similarity of the responses I get, assuming all the other parameters are the same?

It’s a tough question.

Generally not really, although you may get different model fingerprints with different calls, giving you slightly different results. https://platform.openai.com/docs/api-reference/chat/create#chat-create-seed

I wouldn’t worry about it.

Your first question has been answered and explored here:

This is helpful when you want to check if the model replies according to your expectations. One can send the same request 10- 10,000 times and evaluate the replies, assessing if the deviation from the required reply quality is 1% or 5%.

That’s going to save time and money.

1 Like