Prompt tokens are much lower than the number reported in the response

Hi,

I have a prompt that comes out to roughly 300 tokens according to this tool:

https://platform.openai.com/tokenizer

I cannot provide my exact prompt here because of policy issues.
Maybe it can be answered in a general way. Maybe the tokens are counted differently?

But in the response from the GPT-4o model I get something like

completion_tokens=476, prompt_tokens=1509, total_tokens=1985

What could be the reason for this HUGE difference (1509 vs. 300)? The number of prompt tokens differs by a factor of 5!

Are you using Assistants or Chat Completions?


I use the API, specifically chat.completions.

Then we cannot help you beyond throwing random things in the air and hoping something sticks. At the very least, you could provide the code you use to structure the prompt.

I’m 99% certain that you are not structuring the conversation correctly, and that data is “leaking” through.

What you need to do is create the conversation object, and then run it through a library like tiktoken and see what it says.
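
For example, a minimal sketch with tiktoken (assuming the o200k_base encoding that GPT-4o uses; the per-message overhead below is only an approximation, and the messages list is a placeholder for your own):

    import tiktoken

    enc = tiktoken.get_encoding("o200k_base")  # encoding used by GPT-4o

    messages = [  # placeholder conversation object
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the following text ..."},
    ]

    tokens = 0
    for m in messages:
        tokens += 4  # rough per-message overhead (role, separators)
        tokens += len(enc.encode(m["content"]))
    tokens += 3  # rough priming for the assistant's reply

    print(tokens)

If that number is close to 300 but the API still reports 1509, then something besides your visible text is being sent along with the request.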

I am happy to hear anything that comes to mind, even if it is a bit of a guess. I think it can still help.

I don’t use “conversations”, or what do you mean by data “leak”? It is just the standard (single) messages object with a system prompt and a user input message sent to OpenAI.

btw: I am also using the beta endpoint for inference:

    response = openai.beta.chat.completions.parse(
        model=model,
        messages=message,
        response_format=response_type,
    )

1: Are you using images? They also consume tokens.
2: Are you using chatbot software that provides the AI with past turns of conversation? Those past messages are also counted as prompt tokens.
3: Are you actually using an o1-series reasoning model? They also consume tokens with internal generation.
4: Are you providing system instructions? Are you providing tools? Those instructions and tool specifications also consume prompt tokens.

etc.
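
One quick way to narrow it down is to strip the request to a bare minimum and re-add things one at a time while watching the reported usage (a rough sketch; the model name and prompt are just placeholders):

    from openai import OpenAI

    client = OpenAI()

    # Bare request: no system prompt, no tools, no response_format.
    minimal = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(minimal.usage.prompt_tokens)  # baseline for this prompt

    # Now re-add your system prompt, tools, response_format, etc. one at
    # a time and watch how prompt_tokens grows with each addition.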

Ok. I think I see the issue.

You are most likely following the Structured Outputs guide.

You are not counting the tokens in your schema.

Remove response_format=response_type, and the tokens will line up.
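
The schema behind response_type is injected into the request for you, so it is billed as part of the prompt. You can get a rough feel for how many tokens your schema adds by tokenizing its JSON form yourself (a sketch; the Report model stands in for your response_type, and the extra wrapping the API adds means the real overhead will be somewhat larger):

    import json

    import tiktoken
    from pydantic import BaseModel

    class Report(BaseModel):  # stand-in for your response_type
        title: str
        summary: str
        findings: list[str]

    schema_text = json.dumps(Report.model_json_schema())
    enc = tiktoken.get_encoding("o200k_base")  # encoding used by GPT-4o
    print(len(enc.encode(schema_text)))  # rough token cost of the schema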


I don’t understand. Can you explain? Btw, I have to use Structured Outputs, as it is necessary for my use case. How is this related to the token count in the prompt? (The response token count seems to be fine!)