Clarification on token pricing for multiple completions (n>1) in a single API call

rishi4 · July 3, 2024, 1:32pm

I’m working on a project that requires multiple completions for a single prompt using the OpenAI API. I’m trying to understand how the token pricing works when requesting multiple completions (n>1) in a single API call. Specifically:

Are input (prompt) tokens charged once or multiple times based on the number of completions requested?

I’ve looked through the documentation but couldn’t find clear information on this specific topic. The documentation does say, “Note that you will be charged based on the number of generated tokens across all of the choices. Keep n as 1 to minimize costs.”

_j · July 3, 2024, 5:16pm

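This is a relatively easy question to answer: compare the usage object returned for the same prompt at n=1 and at n=10. The prompt tokens are counted once, while the completion tokens add up across all choices.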

GPT-4-turbo-2024-04-09:

"usage": {
  "prompt_tokens": 211,
  "completion_tokens": 100,
  "total_tokens": 311
},

GPT-4-turbo-2024-04-09 @ n=10:

"usage": {
  "prompt_tokens": 211,
  "completion_tokens": 1000,
  "total_tokens": 1211
},

In this case the request was sent with top_p: 0.001, so n>1 is of little value, because the choices come back almost identical…
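
To make the arithmetic behind those two usage objects concrete, here is a minimal sketch. The token counts are the ones reported above. The `Usage` class, the `estimate_cost` helper, and the two rates are hypothetical placeholders for illustration and are not official prices, so substitute the current per-token rates for your model. It also assumes that each of ten separate n=1 calls would again produce 100 completion tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Hypothetical mirror of the `usage` object shown in the thread."""
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


def estimate_cost(usage: Usage, input_rate: float, output_rate: float) -> float:
    """Estimated cost in dollars; both rates are dollars per 1M tokens."""
    return (usage.prompt_tokens * input_rate
            + usage.completion_tokens * output_rate) / 1_000_000


# Token counts reported in the thread.
single = Usage(prompt_tokens=211, completion_tokens=100)    # n=1
batched = Usage(prompt_tokens=211, completion_tokens=1000)  # one call, n=10

# Ten separate n=1 calls resend the prompt every time (assumption: each
# call again yields 100 completion tokens).
separate = Usage(prompt_tokens=10 * single.prompt_tokens,
                 completion_tokens=10 * single.completion_tokens)

# Placeholder rates for illustration only, NOT official prices.
INPUT_RATE = 10.0   # dollars per 1M prompt tokens (assumed)
OUTPUT_RATE = 30.0  # dollars per 1M completion tokens (assumed)

print(batched.total_tokens)   # 1211
print(separate.total_tokens)  # 3110
print(estimate_cost(batched, INPUT_RATE, OUTPUT_RATE))
print(estimate_cost(separate, INPUT_RATE, OUTPUT_RATE))
```

Under these assumptions the single n=10 call accounts for 211 prompt tokens instead of 2110, while the completion tokens are the same 1000 either way, so any saving comes entirely from the input side.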

