Understanding GPT-4 API pricing with respect to roles and request/response

80 · May 6, 2023, 10:25pm

GPT-4 8k model price:
3 cents per 1k tokens for the prompt
6 cents per 1k tokens for the completion

Is “prompt” here another word for the (HTTP) request?
And does “completion” mean the response?

I see another interpretation: it depends on the role:

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "user", "content": "Hi, can you help me find a good restaurant nearby?"},
    {"role": "assistant", "content": "Of course! What type of cuisine are you in the mood for?"},
    {"role": "user", "content": "I'm feeling like having Italian food tonight."},
    {"role": "assistant", "content": "Great! I recommend trying out Trattoria Toscana. It's a cozy little Italian place with great pasta dishes."},
    {"role": "user", "content": "That sounds perfect! What's the address?"},
    {"role": "assistant", "content": "The address is 123 Main Street. Enjoy your meal!"}
  ]
}

Here we would concatenate all contents from the user role; this then gets tokenized and costs 3 cents per 1k tokens. The assistant contents, on the other hand, are computationally different and thus cost 6 cents per 1k tokens.

Also: am I right that the JSON skeleton of my request does not count towards the token limit? Only the concatenation of the “content” values?

PaulBellow · May 6, 2023, 10:54pm

You’re charged for every token going in and every token coming out. With practice, you can learn how to pare down prompts to the bare minimum to get what you’re after for the best price. Alternatively, in some cases, it might be possible to craft a prompt with one or two examples (one-shot or two-shot) and get results that are just as good for a lot less: $0.002 per 1k tokens with gpt-3.5-turbo…

80 · May 6, 2023, 11:12pm

Okay good, so indeed “prompt” means “HTTP request” while “completion” means “HTTP response”. OpenAI just preferred using some different wording here.

I have already been able to develop some routines to greatly reduce the token count. For example, today I went from 904 down to 161 tokens. Anyway, thanks for the response, I don’t want to go too off-topic here!

joshua.rubin · September 7, 2023, 10:53am

Sorry to reopen an old question, but what about the JSON skeleton? I’m still confused about this: am I also charged for the tokens in the words “model”, “messages”, “role” and “content” every time they appear? What about the tokens for “{”, “}”, “[”, “]”?

80 · September 13, 2023, 1:35am

My understanding is that the JSON skeleton does not count towards the token count. Instead we take the concatenation of all “content” values of our “messages”, tokenize that, and do the same with the engine response. Only those tokens count.
But yes, if I send ten messages with 100 tokens each (i.e. the value corresponding to “content”), then I’ll have to pay for 1k prompt tokens. If the response is 500 tokens, then those need to be paid for too.
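
As a rough sketch of the arithmetic in this example, assuming the GPT-4 8k prices quoted at the top of the thread (3 cents per 1k prompt tokens, 6 cents per 1k completion tokens). The function name and its parameters are made up for illustration and are not part of any OpenAI library:

```python
def estimate_cost_usd(prompt_tokens, completion_tokens,
                      prompt_price_per_1k=0.03, completion_price_per_1k=0.06):
    """Estimate the cost of one call in USD.

    The default prices are the GPT-4 8k figures quoted in this thread
    (an assumption; check the current pricing page before relying on them).
    """
    prompt_cost = prompt_tokens / 1000 * prompt_price_per_1k
    completion_cost = completion_tokens / 1000 * completion_price_per_1k
    return prompt_cost + completion_cost


# Ten messages of 100 content tokens each, plus a 500-token response.
cost = estimate_cost_usd(prompt_tokens=10 * 100, completion_tokens=500)
print(f"${cost:.2f}")  # prints "$0.06"
```

So in this example the 1k prompt tokens and the 500 completion tokens each come to 3 cents. In practice the token counts to plug in can be read from the `usage` field of the API response (`prompt_tokens` and `completion_tokens`) instead of being estimated by hand.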

_j · September 13, 2023, 2:24am

You only count the tokens within “content”.

However, the role messages and the call itself do have overhead from the unseen format in which these messages are passed to the AI language model.

First role: 7 tokens
Additional role: 4 tokens
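
A minimal sketch of how that overhead could be added to an estimate, assuming the two figures above mean 7 extra tokens for the first message and 4 extra tokens for each further message. That reading, the constant names, and the function name are assumptions for illustration; the real overhead may differ by model.

```python
# Overhead figures taken from the post above; the per-message reading is an assumption.
FIRST_MESSAGE_OVERHEAD = 7
ADDITIONAL_MESSAGE_OVERHEAD = 4


def estimate_prompt_tokens(content_token_counts):
    """Estimate billed prompt tokens from the token count of each message's content.

    content_token_counts: list of ints, one per message in "messages".
    """
    if not content_token_counts:
        return 0
    overhead = (FIRST_MESSAGE_OVERHEAD
                + ADDITIONAL_MESSAGE_OVERHEAD * (len(content_token_counts) - 1))
    return sum(content_token_counts) + overhead


# Ten messages whose contents are 100 tokens each:
# 1000 content tokens + 7 + 9 * 4 = 1043 estimated prompt tokens.
print(estimate_prompt_tokens([100] * 10))  # prints 1043
```

The per-message content counts would have to come from a tokenizer; comparing the estimate against the `prompt_tokens` value reported in the API response's `usage` field is the easiest way to check whether these overhead figures hold for a given model.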
