Doubt on prompt tokens and completion tokens

After successfully calling the OpenAI chat completions API, the response includes a usage object like the one below (model is gpt-3.5-turbo):
"usage": {
  "prompt_tokens": 5807,
  "completion_tokens": 312,
  "total_tokens": 6119
},
My question is about tokens. I understand that completion tokens are the tokens generated by OpenAI, with a maximum length of 4,096.
Prompt tokens are the tokens we feed the API as input. What is their maximum length? That is my first question.
My second question is about total_tokens, which is the sum of prompt and completion tokens. Is there a maximum length for this key too?

Limitations for prompt tokens are a function of the model's context window. Different models have different context windows, e.g. the gpt-4-turbo model series has a context window of 128,000 tokens while the regular gpt-4 model has a context window of 8,192, etc. You can find the breakdown by model in the overview here.

The sum of prompt and completion tokens cannot exceed the context window. However, as you rightly understand, completion tokens are currently limited to 4,096 tokens. Hence, the maximum number of prompt tokens is the difference between the context window and the completion tokens.
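That relationship is simple arithmetic, and can be sketched as follows (the context-window figures are the ones quoted above; the 16,385 figure for gpt-3.5-turbo comes from the API error message quoted later in this thread):

```python
# Largest prompt that still leaves room for a full-length completion,
# given a model's context window and the 4,096-token completion limit.
CONTEXT_WINDOWS = {
    "gpt-4-turbo": 128_000,
    "gpt-4": 8_192,
    "gpt-3.5-turbo": 16_385,
}
MAX_COMPLETION_TOKENS = 4_096

def max_prompt_tokens(model: str) -> int:
    """Maximum prompt size that leaves room for a maximal completion."""
    return CONTEXT_WINDOWS[model] - MAX_COMPLETION_TOKENS

print(max_prompt_tokens("gpt-4"))        # 4096
print(max_prompt_tokens("gpt-4-turbo"))  # 123904
```

So for gpt-4, a prompt larger than about 4,096 tokens already starts eating into the space available for the completion.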

You can also find official information about tokens here and here.


Also a bit curious: if you send the max_tokens API parameter, the value doesn't just limit the output, it also acts as a reservation of tokens from the context length that is set aside exclusively for the output (enforced by an API error).

using gpt-3.5-turbo

Example with max_tokens:20 parameter:

{'completion_tokens': 20, 'prompt_tokens': 15007, 'total_tokens': 15027}

Same with max_tokens:2000:

'message': "This model's maximum context length is 16385 tokens. However, you requested 17007 tokens (15007 in the messages, 2000 in the completion)

But if left unspecified, you get a cutoff at 4096:

{'completion_tokens': 32, 'prompt_tokens': 15007, 'total_tokens': 15039}

Python example demonstrating the "reservation blocking":

import openai

client = openai.Client()
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    # Repeat a short question to inflate the prompt past ~15,000 tokens
    messages=[{"role": "user", "content": "Do chimps laugh?" * 3000}],
    max_tokens=2000,  # reserves 2000 tokens of context for the output
)
print(response.choices[0].message.content, "\n", response.usage.model_dump())
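The check behind that error can be sketched locally, without calling the API (the 16,385 constant and the validation logic here are inferred from the error message above, not taken from the real server code):

```python
# Illustrative sketch of the validation the API appears to apply:
# prompt tokens plus the max_tokens reservation must fit in the context window.
CONTEXT_WINDOW = 16_385  # gpt-3.5-turbo, per the error message above

def check_request(prompt_tokens: int, max_tokens: int) -> None:
    """Raise if the prompt plus the output reservation exceeds the window."""
    requested = prompt_tokens + max_tokens
    if requested > CONTEXT_WINDOW:
        raise ValueError(
            f"This model's maximum context length is {CONTEXT_WINDOW} tokens. "
            f"However, you requested {requested} tokens "
            f"({prompt_tokens} in the messages, {max_tokens} in the completion)"
        )

check_request(15_007, 20)       # passes: 15027 fits in the window
# check_request(15_007, 2_000)  # would raise, matching the error above
```

This matches the observed behavior: the request is rejected up front based on the reservation, even if the model would have produced far fewer than max_tokens tokens.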