Vision - input image pricing
I thought it would be more useful to show you a comparison of what these models actually cost to run with input images, rather than handing you abstract formulas for tokens, multipliers, and costs per million.
By model: 1920x1080 vs. 768x512
Notably: OpenAI is multiplying the image token cost of GPT-5.2 by 1.2x, which is mentioned NOWHERE. That makes it the most expensive model since the original GPT-4o.
High-cost models (vs gpt-4o)
The large image (1920x1080) is typical of what you might send, and it will be downsized whether the model is a “patches” or a “tiles” model.
The small image (768x512) is a 2-tile image, not exceeding 512px in height: one you might deliberately craft.
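For anyone who wants to check the tile counts for the two sizes above, here is a rough sketch of the “tiles” calculation. It assumes the publicly documented gpt-4o high-detail rule (fit within 2048x2048, shrink so the shortest side is at most 768px, then count 512px tiles at 85 base tokens plus 170 per tile) and assumes small images are never upscaled, which is what makes 768x512 a 2-tile image. The function names are my own, and other models use different base and per-tile values, so treat the defaults as placeholders rather than a price list.

```python
import math


def count_tiles(width: int, height: int) -> int:
    """Count 512px tiles after the assumed gpt-4o-style downsizing steps."""
    # Step 1: shrink to fit inside a 2048x2048 square (assumed: never upscale).
    scale = min(1.0, 2048 / max(width, height))
    w, h = round(width * scale), round(height * scale)

    # Step 2: shrink so the shortest side is at most 768px (assumed: never upscale).
    scale = min(1.0, 768 / min(w, h))
    w, h = round(w * scale), round(h * scale)

    # Step 3: count the 512px tiles needed to cover the resized image.
    return math.ceil(w / 512) * math.ceil(h / 512)


def tile_tokens(width: int, height: int, base: int = 85, per_tile: int = 170) -> int:
    """Token estimate for a 'tiles' model; defaults are the gpt-4o values."""
    return base + per_tile * count_tiles(width, height)


if __name__ == "__main__":
    for size in [(1920, 1080), (768, 512)]:
        print(size, count_tiles(*size), "tiles,", tile_tokens(*size), "tokens")
```

With these assumptions the large image resizes to 1365x768 and lands on 6 tiles, while the small one stays at 768x512 and needs only 2. Any per-model multiplier would be applied on top of the token count this returns.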
Try your own image resolutions here in “comparison mode”.


