What is the token cost for image prompt in GPT-4o?

I am testing the image-captioning capability of GPT-4o (Azure OpenAI) through the Python API. According to the OpenAI guide here, processing an image with "detail": "low" should consume 85 tokens. However, my experiment shows that only 73 tokens are being used. Am I missing something here?
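For reference, this is roughly how I am making the call (the endpoint, key, deployment name, and image URL below are placeholders):

```python
from openai import AzureOpenAI

# Placeholder credentials and deployment name -- substitute your own.
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-06-01",
)

response = client.chat.completions.create(
    model="<your-gpt-4o-deployment>",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/sample.jpg",  # placeholder image
                        "detail": "low",
                    },
                }
            ],
        }
    ],
)

# prompt_tokens is what I am comparing against the 85 tokens in the guide
print(response.usage)
```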

This is part of the output from the GPT-4o response. The prompt_tokens field shows that only 73 tokens were consumed (I sent only the image, with no other prompt).

ChatCompletion(..., model='gpt-4o-2024-05-13', ..., usage=CompletionUsage(completion_tokens=97, prompt_tokens=73, total_tokens=170),...)

Yes, GPT-4o uses a different token encoder, so the same information can be encoded into a different number of tokens.

Exactly how images are tokenized or embedded for understanding is OpenAI's secret, and one "tile" doesn't come out to a round number, so the practical answer is: the consumption is lower and a bit unpredictable.
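As a rough illustration of the encoder difference (this only shows text tokenization, and assumes a tiktoken version recent enough to know about gpt-4o; the image accounting itself is not exposed):

```python
import tiktoken

# GPT-4o maps to the o200k_base encoding; GPT-4 / GPT-4 Turbo use cl100k_base.
enc_4o = tiktoken.encoding_for_model("gpt-4o")
enc_4 = tiktoken.encoding_for_model("gpt-4")

text = "Describe this image in one sentence."
# Token counts for the same text typically differ between the two encodings.
print(enc_4o.name, len(enc_4o.encode(text)))
print(enc_4.name, len(enc_4.encode(text)))
```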


That explained everything. Thanks!