How do I calculate image tokens in GPT4 Vision?

According to the pricing page, every image is first resized (if too large) to fit within a 1024x1024 square, and its global description costs 85 base tokens.

Tiles

To be recognized in full detail, the image is then covered by 512x512 tiles.
Each tile costs 170 tokens, so by default the formula is:
total tokens = 85 + 170 * n, where n is the number of tiles needed to cover your image.

Implementation

This can be computed as follows:

from math import ceil

def resize(width: int, height: int) -> tuple[int, int]:
    # Scale the image down to fit within a 1024x1024 square,
    # preserving the aspect ratio; smaller images are left untouched.
    if width > 1024 or height > 1024:
        if width > height:
            height = int(height * 1024 / width)
            width = 1024
        else:
            width = int(width * 1024 / height)
            height = 1024
    return width, height

def count_image_tokens(width: int, height: int) -> int:
    # 85 base tokens, plus 170 tokens per 512x512 tile.
    width, height = resize(width, height)
    h = ceil(height / 512)
    w = ceil(width / 512)
    return 85 + 170 * h * w

Some examples

  • 500x500 → one tile is enough to cover it, so total tokens = 85+170 = 255
  • 513x500 → you need 2 tiles → total tokens = 85+170*2 = 425
  • 513x513 → you need 4 tiles → total tokens = 85+170*4 = 765
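As a sanity check, the function above reproduces these numbers (the logic is repeated here, condensed into one function, so the snippet runs on its own):

```python
from math import ceil

def count_image_tokens(width: int, height: int) -> int:
    # Resize to fit within 1024x1024, then count 512x512 tiles.
    if width > 1024 or height > 1024:
        if width > height:
            height = int(height * 1024 / width)
            width = 1024
        else:
            width = int(width * 1024 / height)
            height = 1024
    return 85 + 170 * ceil(height / 512) * ceil(width / 512)

print(count_image_tokens(500, 500))  # 255
print(count_image_tokens(513, 500))  # 425
print(count_image_tokens(513, 513))  # 765
```

Note that a 2048x2048 image is first resized to 1024x1024, so it costs the same 765 tokens as the 513x513 case.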

low_resolution mode

In “low resolution” mode, no tiles are used; only the 85 base tokens are charged, whatever the size of your image.
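Both modes can be handled with a single flag. The `low_resolution` parameter name below is an assumption for illustration (in the API request this corresponds to the image detail setting); a minimal sketch:

```python
from math import ceil

def count_image_tokens(width: int, height: int,
                       low_resolution: bool = False) -> int:
    # Low-resolution mode: flat 85 tokens, image size ignored.
    # (The flag name is hypothetical; the tiling logic mirrors the post above.)
    if low_resolution:
        return 85
    # High-detail mode: resize to fit within 1024x1024, then tile.
    if width > 1024 or height > 1024:
        if width > height:
            height = int(height * 1024 / width)
            width = 1024
        else:
            width = int(width * 1024 / height)
            height = 1024
    return 85 + 170 * ceil(height / 512) * ceil(width / 512)
```

For example, a 4000x3000 image costs 85 tokens in low-resolution mode, versus 85 + 170 per tile in the default mode.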
