Pricing based on actual or requested output length

The OpenAI Pricing Documentation seems a bit ambiguous:

Requests are billed based on the number of input tokens sent plus the number of tokens in the output(s) returned by the API.

Your request may use up to num_tokens(input) + [max_tokens * max(n, best_of)] tokens, which will be billed at the per-engine rates outlined at the top of this page.

In the simplest case, if your prompt contains 200 tokens and you request a single 900 token completion from the gpt-3.5-turbo-1106 API, your request will use 1100 tokens and will cost [(200 * 0.001) + (900 * 0.002)] / 1000 = $0.002.

The “returned by the API” part suggests billing is based on the actual tokens returned, regardless of how many I requested (I’m assuming the amount you request is set by max_tokens). However, in their example they say you “request a single 900 token completion” and then use that 900 in the calculation, seemingly implying that if you requested 900 you get charged for those 900 whether or not they were all generated.

So, if I send an API request with max_tokens set to 900, but the actual completion only contains 450 tokens, then (ignoring input tokens) am I charged for the 900 tokens requested or the 450 in the response?


You are only billed for the tokens you actually consume, and that amount is deducted from your prepaid credits.

Cost = (input tokens × input token rate) + (actually generated output tokens × output token rate)

You can check your usage for every API request with the usage object that’s part of the response you receive.
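As a minimal sketch of the calculation above, using the per-1K rates from the docs example quoted earlier (the helper name and the sample usage values are hypothetical, but the usage fields match what the API returns):

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_rate_per_1k=0.001, output_rate_per_1k=0.002):
    """Cost in dollars, defaulting to the gpt-3.5-turbo-1106 rates above."""
    return (prompt_tokens * input_rate_per_1k
            + completion_tokens * output_rate_per_1k) / 1000

# Example usage object as reported in the API response
# (values are illustrative: 900 was requested, only 450 were generated).
usage = {"prompt_tokens": 200, "completion_tokens": 450, "total_tokens": 650}

cost = estimate_cost(usage["prompt_tokens"], usage["completion_tokens"])
# Billed for the 450 tokens actually generated, not the 900 in max_tokens.
```

Note that completion_tokens reflects what was actually generated, so a completion that stops early costs less than the max_tokens cap would suggest.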
