Understanding GPT-Vision API pricing?

_j · May 12, 2024, 6:22pm

I count the same 227 completion tokens by reassembling your assistant text and sending it to a tokenizer.

The user message is 7 tokens of overhead + 12 tokens of message text, so 19 tokens in total.

So if we have 765 tokens of prompt still to account for, it must be from images.

Total tiles	4
Base tokens	85
Tile tokens	170 × 4 = 680
Total tokens	765

In detail:high mode, the image is resized internally based on its shortest side (scaled down to 768 px if it is larger) and then covered with 512x512 “tiles”. That is why a 640x640 image, and even anything up to 1024x1024, takes 4 tiles.

For a 640x640 image, detail:low makes it cost only the 85 base tokens, with the image being downsized to 512x512 internally.
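
The arithmetic above can be sketched as a small calculator. This is only a rough sketch: the function name `image_tokens` is made up for illustration, the 2048 px and 768 px limits are taken from OpenAI's published description of the vision resizing, and I'm assuming images are only ever scaled down (never up) and that sizes are rounded to whole pixels. The actual server-side behavior may differ in the details.

```python
import math

BASE_TOKENS = 85    # flat cost per image (the whole cost in detail:low)
TILE_TOKENS = 170   # cost per 512x512 tile in detail:high
TILE_SIZE = 512

def image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate prompt tokens for one image (hypothetical helper)."""
    if detail == "low":
        return BASE_TOKENS

    # Step 1: fit within a 2048x2048 square, keeping the aspect ratio.
    scale = min(1.0, 2048 / max(width, height))
    w, h = round(width * scale), round(height * scale)

    # Step 2: bring the shortest side down to 768 px.
    # Assumption: smaller images are left alone rather than upscaled.
    scale = min(1.0, 768 / min(w, h))
    w, h = round(w * scale), round(h * scale)

    # Step 3: count the 512x512 tiles needed to cover the result.
    tiles = math.ceil(w / TILE_SIZE) * math.ceil(h / TILE_SIZE)
    return BASE_TOKENS + TILE_TOKENS * tiles

print(image_tokens(640, 640))          # 4 tiles: 85 + 170 * 4
print(image_tokens(1024, 1024))        # resized to 768x768, still 4 tiles
print(image_tokens(640, 640, "low"))   # base tokens only
```

Under these assumptions, both 640x640 and 1024x1024 come out at 765 in detail:high, matching the unexplained prompt tokens, and only images of 512x512 or smaller drop to a single tile.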
