Token Usage for Images Remains Constant Regardless of Size - Is This a Bug?

joswin86 · September 22, 2024, 6:55pm

Hi everyone,

I’ve been using the OpenAI API to process invoice summaries with images. I have a function that takes images and a prompt to generate a summary. Here’s the function I’m using:

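import OpenAI from 'openai';

// The client reads OPENAI_API_KEY from the environment by default.
const openai = new OpenAI();
// Assumed for illustration; the post does not say which model was used.
const gptModel = 'gpt-4o-mini';

async function processInvoiceSummary(images: string[], prompt: string) {
    // A single user message that starts with the text prompt.
    const messages: any[] = [
        {
            role: 'user',
            content: [
                {
                    type: 'text',
                    text: prompt
                }
            ]
        }
    ];

    // Append every image (URL or base64 data URL) at high detail.
    images.forEach((image) => {
        messages[0].content.push({
            type: 'image_url',
            image_url: {
                url: image,
                detail: 'high'
            }
        });
    });

    const completion = await openai.chat.completions.create({
        model: gptModel,
        messages: messages
    });

    const summary = completion.choices[0].message.content;
    // Return the usage so prompt_tokens can be compared across image sizes.
    return { summary, usage: completion.usage };
}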

I am testing it with 4 images, and the result for token usage is always the same, like this:

{
  "prompt_tokens": 103310,
  "completion_tokens": 123,
  "total_tokens": 103433,
  "completion_tokens_details": { "reasoning_tokens": 0 }
}

I tried changing the image sizes from the original 1153x1536 to 3756x5000, but the token usage still remains the same.

This doesn’t seem right, based on the documentation which mentions:

high will enable “high res” mode, which first allows the model to first see the low res image (using 85 tokens) and then creates detailed crops using 170 tokens for each 512px x 512px tile.

Am I doing something wrong here, or is this a known issue/bug? Any insights or advice would be greatly appreciated!

_j · September 22, 2024, 7:11pm

I just did a deep dive into what you can expect for token usage (and rate usage) for a variety of resolutions, detail settings, and models.

If two images at detail:high need the same number of tiles, they cost the same. This means anything from 513x513 to 1024x1024, or anything in between, results in 4 overlay tiles (on top of a base “low” image).

There are also peculiarities in the internal downsizing, even at detail:high. Your image is downsized so that its shortest side is at most 768 pixels. Send 3000x3000 and the model sees 768x768, which takes 4 tiles of 512x512. Send 2000x500 and the model sees 2000x500, which also takes 4 tiles of 512x512.
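
The rules described above can be sketched in code. This is a rough estimate, not OpenAI's implementation: the 85 and 170 token figures come from the documentation quoted in the first post, the initial fit inside 2048x2048 is an assumption taken from OpenAI's vision guide rather than from this thread, and the function name and rounding choices are hypothetical.

```typescript
// Rough estimate of detail:high image tokens; a sketch, not OpenAI's actual code.
const BASE_TOKENS = 85;  // low-res overview, from the quoted documentation
const TILE_TOKENS = 170; // per 512x512 tile, from the quoted documentation
const TILE_SIZE = 512;

interface HighDetailEstimate {
  width: number;  // estimated width after resizing
  height: number; // estimated height after resizing
  tiles: number;
  tokens: number;
}

function estimateHighDetail(width: number, height: number): HighDetailEstimate {
  // Assumption (from OpenAI's vision guide): first fit inside 2048x2048.
  const fit = Math.min(1, 2048 / Math.max(width, height));
  let w = Math.round(width * fit);
  let h = Math.round(height * fit);

  // Then downsize so the shortest side is at most 768; never upscale.
  const shrink = Math.min(1, 768 / Math.min(w, h));
  w = Math.round(w * shrink);
  h = Math.round(h * shrink);

  const tiles = Math.ceil(w / TILE_SIZE) * Math.ceil(h / TILE_SIZE);
  return { width: w, height: h, tiles, tokens: BASE_TOKENS + TILE_TOKENS * tiles };
}

// Both of the original poster's sizes land on the same 2 x 2 grid.
console.log(estimateHighDetail(1153, 1536).tiles); // 4
console.log(estimateHighDetail(3756, 5000).tiles); // 4
```

Because the tile count only changes when a resized side crosses a multiple of 512, raising the source resolution past the downsizing threshold leaves the estimate unchanged.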

sps · September 22, 2024, 7:42pm

Welcome @joswin86

It’s not a bug. This cost calculation is by design.

You can use the image cost calculator on the pricing page, with your model of choice selected, to get a breakdown of how the cost is computed for a given image size.

Here's the cost breakdown for a 1536x1536 image with gpt-4o-mini:

Price per 1M tokens (fixed)	$0.15
Resized width	768
Resized height	768
512 × 512 tiles	2 × 2
Total tiles	4
Base tokens	2833
Tile tokens	5667 × 4 = 22668
Total tokens	25501
Total price	$0.003825

Here's the cost calculation for the same model for a 3756x5000 image:

Price per 1M tokens (fixed)	$0.15
Resized width	768
Resized height	1023
512 × 512 tiles	2 × 2
Total tiles	4
Base tokens	2833
Tile tokens	5667 × 4 = 22668
Total tokens	25501
Total price	$0.003825
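
The arithmetic in the two breakdowns above can be reproduced with a few lines. This is only a sketch: the price and token figures are copied from the gpt-4o-mini breakdown, while the `ImagePricing` type and `imageCost` helper are hypothetical names introduced for illustration.

```typescript
// Figures copied from the gpt-4o-mini breakdown above; the helper is hypothetical.
interface ImagePricing {
  pricePerMillionTokensUsd: number;
  baseTokens: number;
  tokensPerTile: number;
}

const GPT_4O_MINI: ImagePricing = {
  pricePerMillionTokensUsd: 0.15,
  baseTokens: 2833,
  tokensPerTile: 5667,
};

function imageCost(tilesAcross: number, tilesDown: number, pricing: ImagePricing) {
  const tiles = tilesAcross * tilesDown;
  const tokens = pricing.baseTokens + pricing.tokensPerTile * tiles;
  const usd = (tokens * pricing.pricePerMillionTokensUsd) / 1_000_000;
  return { tiles, tokens, usd };
}

const perImage = imageCost(2, 2, GPT_4O_MINI);
console.log(perImage.tokens);         // 25501
console.log(perImage.usd.toFixed(6)); // 0.003825
```

Four such images come to 4 x 25501 = 102004 tokens, which would leave about 1306 of the reported 103310 prompt tokens for the text prompt and message overhead, assuming the original poster's model was gpt-4o-mini (the first post does not say).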

_j · September 22, 2024, 8:11pm

I’m trying to get o1-preview to make an app, show what it can do. I provided all the rules and specifications.

The playground is designed to time out and take your money apparently. Over to my chatbot.

The first go was terrible: 100x500 = 16 high detail tiles? It couldn’t be improved.

I abandoned that attempt. A second, more extensive prompt covered every area of failure, even explaining that the tile origin starts at a corner. Close, but no cookie.

You get the idea, though: this is how OpenAI could present it on the web.

joswin86 · September 23, 2024, 9:49am

Thanks for the insights!

Just to clarify, since my images always have the same ratio (A4 format), does that mean it doesn’t make any difference if I increase the resolution beyond 768px on the smaller side? From what I understand, as long as the smaller side exceeds 768px, the model will downsize it to 768px, and the token usage will remain the same regardless of any further increase in resolution, correct?

Appreciate the help!

_j · September 23, 2024, 1:24pm

You have reached the correct conclusion.

A large A4 page image (in fact any A-series paper size in portrait orientation) will always be resized to 768x1087. That consumes six high-detail tiles, because the longer dimension exceeds two tiles.

You could therefore consider the economy of sending the image at 725x1024, where the expense drops to four high-detail tiles.
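
To make the comparison concrete, here is a small sketch that counts tiles for the two sizes mentioned above, written as width x height for a portrait page. It assumes the sizes are already final (no further resizing) and uses the 85 and 170 token figures from the documentation quoted in the first post; the function name is hypothetical, and models that multiply image tokens will report larger numbers.

```typescript
// Tile and token count for an image that is already at its final size.
function highDetailTokens(width: number, height: number): { tiles: number; tokens: number } {
  const tiles = Math.ceil(width / 512) * Math.ceil(height / 512);
  return { tiles, tokens: 85 + 170 * tiles };
}

const resizedA4 = highDetailTokens(768, 1087); // 2 x 3 grid
const smallerA4 = highDetailTokens(725, 1024); // 2 x 2 grid

console.log(resizedA4); // { tiles: 6, tokens: 1105 }
console.log(smallerA4); // { tiles: 4, tokens: 765 }
```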

Or consider the quality increase if you were to use this strategy:

Your code resizes the page to 1024x1448.
You take a view of the top at 1024x768 and a view of the bottom at 1024x768.
The 88 pixels of overlap between the two images give the vision model some common content for joining them.
Both images are placed into the same user message.
You pay for two four-tile images instead of one six-tile image.

The AI would then have higher-resolution text, and more encoded image tokens overall to carry information.
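
The crop geometry for this two-view strategy can be sketched as follows. Only the rectangle arithmetic is shown; the actual resizing and cropping would need an image library, which is left out here. The `CropBox` type and `splitIntoTwoViews` function are hypothetical names for illustration.

```typescript
interface CropBox {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Plan a top view and a bottom view of equal height that together cover the page.
function splitIntoTwoViews(pageWidth: number, pageHeight: number, viewHeight: number) {
  if (viewHeight > pageHeight || 2 * viewHeight < pageHeight) {
    throw new RangeError("two views of this height cannot cover the page");
  }
  const topView: CropBox = { left: 0, top: 0, width: pageWidth, height: viewHeight };
  const bottomView: CropBox = {
    left: 0,
    top: pageHeight - viewHeight, // anchor the second view to the bottom edge
    width: pageWidth,
    height: viewHeight,
  };
  const overlap = 2 * viewHeight - pageHeight; // rows shared by both views
  return { top: topView, bottom: bottomView, overlap };
}

const plan = splitIntoTwoViews(1024, 1448, 768);
console.log(plan.bottom.top); // 680
console.log(plan.overlap);    // 88
```

Each 1024x768 view already has a shortest side of 768, so under the downsizing rule described earlier in the thread it should not be reduced further.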

_j · September 23, 2024, 2:32pm

OpenAI GPT-4 computer vision image pricing

Comparative breakdown by model

How much does it cost to send the same image to different models?

Here is the cost at detail:low (detail:high can cost up to 17 times more):

Model	Price per 1M tokens ($)	Token multiplier	Cost per 1k images	Cost ratio
gpt-4o-2024-08-06	$2.50	1	$0.213	1
gpt-4o-mini-2024-07-18	$0.15	33.33	$0.425	2
gpt-4o-2024-05-13	$5.00	1	$0.425	2
gpt-4-turbo-2024-04-09	$10.00	1	$0.850	4
gpt-4-0125-preview	$10.00	1	$0.850	4
gpt-4-1106-vision-preview	$10.00	1	$0.850	4

gpt-4o-mini: the image tokens the model receives are multiplied in usage reporting and billing.
Aliases that point to these actual model names are excluded.
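
The table's last two columns follow from the first two. The sketch below recomputes them for a few rows, assuming 85 tokens per detail:low image (the figure from the documentation quoted in the first post) and using the prices and multipliers from the table; the type and function names are hypothetical.

```typescript
interface ModelRow {
  model: string;
  pricePerMillionUsd: number;
  tokenMultiplier: number; // how much the billed image tokens are scaled up
}

const LOW_DETAIL_TOKENS = 85; // per image at detail:low

const MODELS: ModelRow[] = [
  { model: "gpt-4o-2024-08-06", pricePerMillionUsd: 2.5, tokenMultiplier: 1 },
  { model: "gpt-4o-mini-2024-07-18", pricePerMillionUsd: 0.15, tokenMultiplier: 33.33 },
  { model: "gpt-4o-2024-05-13", pricePerMillionUsd: 5, tokenMultiplier: 1 },
  { model: "gpt-4-turbo-2024-04-09", pricePerMillionUsd: 10, tokenMultiplier: 1 },
];

// Cost in USD of sending 1000 detail:low images to the given model.
function costPerThousandImages(row: ModelRow): number {
  const billedTokens = LOW_DETAIL_TOKENS * row.tokenMultiplier;
  return (billedTokens * row.pricePerMillionUsd * 1000) / 1_000_000;
}

// Print each model with its cost per 1k images and its ratio to the cheapest.
const cheapest = Math.min(...MODELS.map((row) => costPerThousandImages(row)));
for (const row of MODELS) {
  const cost = costPerThousandImages(row);
  console.log(row.model, cost.toFixed(3), Math.round(cost / cheapest));
}
```

The multiplier is why gpt-4o-mini, despite its much lower per-token price, comes out at roughly twice the per-image cost of gpt-4o-2024-08-06 in this table.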
