Clarification on Token Usage for Image Inputs

Tina_Marcroft · September 9, 2025, 9:27pm

Where is the token usage for image inputs visible? If I look at the OpenAI dashboard, does the “images” token usage graph only refer to image outputs and not inputs?

Thanks! Sorry if this is basic, I’m a newbie!

aprendendo.next · September 9, 2025, 9:45pm

Welcome to the forum. No worries, we’re all newbies.

When images are used as inputs, they are converted into tokens, and how many depends on the model.

On the pricing page, scroll down to the bottom and you will find a calculator tucked into the FAQ section:

How is pricing calculated for images?

Images are converted into tokens and charged per token. Text models price image tokens at standard text token rates, while GPT Image and gpt-realtime use a separate image token rate. Models like gpt-4.1-mini, gpt-4.1-nano, and o4-mini convert images into tokens differently. Learn more in our docs.

_j · September 9, 2025, 9:58pm

A: Vision input is not exposed or metered separately anywhere; it is counted as ordinary input tokens.

You get to see “audio” in chat completions usage because it is billed at a different rate per token.
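To make that concrete, here is a minimal sketch that splits a Chat Completions-style `usage` object into the categories it actually reports. The helper name `summarize_usage` and the sample dictionary are hypothetical; the values are chosen to mirror the low-detail test further down, and the field names follow the documented `usage` shape as I understand it. There is a field for audio and one for cached tokens, but nothing image-specific, so image input simply lands in `prompt_tokens`.

```python
def summarize_usage(usage: dict) -> dict:
    """Break a Chat Completions-style usage dict into reported categories."""
    prompt_details = usage.get("prompt_tokens_details") or {}
    completion_details = usage.get("completion_tokens_details") or {}

    cached = prompt_details.get("cached_tokens", 0)
    audio_in = prompt_details.get("audio_tokens", 0)
    reasoning = completion_details.get("reasoning_tokens", 0)

    return {
        "input": usage["prompt_tokens"],
        "uncached": usage["prompt_tokens"] - cached,
        "cached": cached,
        "audio_input": audio_in,  # the only separately metered input modality
        "output": usage["completion_tokens"],
        "reasoning": reasoning,
        "non_reasoning": usage["completion_tokens"] - reasoning,
    }


# Hypothetical sample, shaped like an API response's usage section.
sample_usage = {
    "prompt_tokens": 87,  # text and image tokens are reported together
    "completion_tokens": 57,
    "total_tokens": 144,
    "prompt_tokens_details": {"cached_tokens": 0, "audio_tokens": 0},
    "completion_tokens_details": {"reasoning_tokens": 0, "audio_tokens": 0},
}

for name, value in summarize_usage(sample_usage).items():
    print(f"{name}: {value}")
```

If you need the image share of a request, the practical route is to estimate it from the image dimensions and subtract, since the response will not itemize it.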

I made a web utility to make the calculation a hair more transparent, and you can enter dimensions or try images:

Discovery

(also ensuring that gpt-5 vision is not overbilling for “low”)

detail:low, 512x513

— Testing
The image is a high-contrast black-and-white checkerboard pattern. It consists of an even grid of alternating black and white squares arranged in rows and columns, with each square the same size. The pattern repeats uniformly across the entire image.

input tokens: 87	output tokens: 57
uncached: 87	non-reasoning: 57
cached: 0	reasoning: 0

detail:high, 512x513

— Testing
The image shows a black-and-white checkerboard pattern filling the entire frame. Squares of equal size alternate between black and white in both rows and columns, creating a grid of repeating checks.

input tokens: 367	output tokens: 46
uncached: 367	non-reasoning: 46
cached: 0	reasoning: 0
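
The 280-token gap between the two runs above (367 - 87) is consistent with tile-based accounting: a 512x513 image needs two 512px tiles, and two tiles at 140 tokens each is 280. That reading assumes gpt-5 uses a 70-token base plus 140 tokens per tile, which is how I read the vision docs, and that the remaining 17 tokens in each run are the text prompt and message overhead. A rough sketch of the calculation follows; the function name, the rounding, and the downscale-only behaviour are assumptions for illustration, not the actual billing code.

```python
import math


def tile_image_tokens(width, height, detail="high",
                      base_tokens=70, tile_tokens=140):
    """Estimate image input tokens for tile-based vision models.

    base_tokens and tile_tokens default to the rates assumed for gpt-5;
    other tile-based models are documented with different rates.
    """
    if detail == "low":
        # Low detail is a flat cost regardless of image size.
        return base_tokens

    w, h = float(width), float(height)

    # Step 1: shrink to fit inside a 2048 x 2048 square.
    longest = max(w, h)
    if longest > 2048:
        scale = 2048 / longest
        w, h = w * scale, h * scale

    # Step 2: shrink so the shortest side is at most 768 px.
    # Assumption: smaller images are not upscaled.
    shortest = min(w, h)
    if shortest > 768:
        scale = 768 / shortest
        w, h = w * scale, h * scale

    # Assumption: dimensions are rounded to whole pixels before tiling.
    w, h = max(1, round(w)), max(1, round(h))

    # Step 3: count the 512 px tiles needed to cover the image.
    tiles = math.ceil(w / 512) * math.ceil(h / 512)
    return base_tokens + tile_tokens * tiles


low = tile_image_tokens(512, 513, detail="low")
high = tile_image_tokens(512, 513, detail="high")
print(low, high, high - low)  # 70 350 280
```

With the 85/170 rates documented for the gpt-4o family, `tile_image_tokens(1024, 1024, base_tokens=85, tile_tokens=170)` gives 765, so the same routine can be reused across tile-based models by swapping the two rate parameters. The one extra pixel of height in the 512x513 test is what forces the second tile.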

“Images” as a usage category is for generations by DALL-E models or gpt-image-1, dedicated to making AI pictures.

jeffvpace · September 10, 2025, 12:49am

@_j @aprendendo.next

Well, all I can say is that gpt-image-1 edits are way too expensive. Actually, this is the only beef I have with OpenAI, so far…

aprendendo.next · September 10, 2025, 1:56am

True… I hope the successor to gpt-image-1 gets cheaper, and perhaps a bit faster. The competition is getting pretty close lately.
