Responses API: How to identify the exact underlying image generation model for precise internal billing?

pesco · April 30, 2026, 9:14am

I’m currently working on a SaaS application where I need to implement a precise internal billing system, deducting credits from my users based on their exact token consumption.

To get accurate usage metrics (especially to track prompt_cache_hit_tokens for input and exact completion_tokens for output), I migrated from the standard Image API (images/generations) to the new Responses API.

The integration works flawlessly, but I’ve hit a roadblock regarding cost calculation.

The documentation states: “The Responses API image generation tool uses its own GPT Image model selection.” While the Response payload correctly provides the token counts in the usage object, it doesn’t seem to expose the name of the underlying image model that was actually used.

Since output token pricing varies drastically between models, multiplying the completion_tokens by the right price is impossible without knowing the exact model.
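To make the problem concrete, here is a minimal sketch of the calculation, assuming a per-model price table. The model keys and the prices are hypothetical placeholders, not real OpenAI pricing; the point is only that the same token count produces very different charges depending on which model is assumed.

```python
from decimal import Decimal

# Hypothetical per-million-token output prices in USD.
# Placeholders for illustration only, not real OpenAI pricing.
OUTPUT_PRICE_PER_MTOK = {
    "image-model-a": Decimal("40.00"),
    "image-model-b": Decimal("8.00"),
}


def output_cost(model: str, output_tokens: int) -> Decimal:
    """Return the USD cost of output_tokens at the model's placeholder rate."""
    try:
        rate = OUTPUT_PRICE_PER_MTOK[model]
    except KeyError:
        # Refuse to guess: an unknown model means an unknown price.
        raise ValueError(f"no price configured for model {model!r}") from None
    return rate * output_tokens / Decimal(1_000_000)


if __name__ == "__main__":
    tokens = 4_160  # example token count reported in a usage object
    for name in OUTPUT_PRICE_PER_MTOK:
        print(name, output_cost(name, tokens))
```

Using `Decimal` rather than floats avoids rounding drift when many small charges are summed into a credit balance.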

My questions are:

1. Is there a strict, documented mapping between the driver LLM and the image model? For instance, is it guaranteed that gpt-5-mini will always trigger gpt-image-1-mini, and gpt-5.5 will always trigger gpt-image-2?
2. Is there a way to extract the exact image model name directly from the Response object payload? (I checked the SDK and couldn’t find a parameter for it).
3. If this is currently a “black box”, are there any plans to expose the underlying image model in the usage block or metadata in future updates? Precise cost attribution is vital for developers building user-facing applications.

Thanks in advance for any insights or official clarifications!

_j · April 30, 2026, 5:45pm

The API reference for tools shows that leaving the model selection on “auto” is your choice; you can also specify the exact model to be used.

It also states that gpt-image-1 is the default model, but the API reference does not yet even include gpt-image-2.
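As a minimal sketch of pinning the model, the helper below builds request arguments with an explicit image model in the tool configuration, so the billing code can record which model was requested. The function names `build_request` and `pinned_image_model` are hypothetical; the resulting dict is intended to be passed as keyword arguments to the OpenAI Python SDK's `client.responses.create(...)`, which is not called here. It assumes the API honors the pinned model, which this thread does not confirm.

```python
from typing import Optional


def build_request(driver_model: str, image_model: str, prompt: str) -> dict:
    """Build Responses API arguments with the image model pinned explicitly."""
    return {
        "model": driver_model,  # the chat model that decides to call the tool
        "input": prompt,
        "tools": [
            # Pin the image model instead of relying on the default.
            {"type": "image_generation", "model": image_model},
        ],
    }


def pinned_image_model(request: dict) -> Optional[str]:
    """Return the image model named in the request's tool list, if any."""
    for tool in request.get("tools", []):
        if tool.get("type") == "image_generation":
            return tool.get("model")
    return None


if __name__ == "__main__":
    request = build_request("gpt-5-mini", "gpt-image-1-mini", "Draw a red bicycle.")
    # Store this alongside the user's usage record before sending the request.
    print(pinned_image_model(request))
```

Recording the pinned model on your side means the price lookup does not depend on the response payload exposing it.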

Costs are another matter entirely. An ImageGenerationCall output item or event provides no usage, model, or billing information, nor does it report the cost of image inputs that were automatically taken from the chat as image generation context. Billing is at the image model’s rates, so it wouldn’t appear in “usage details”. By “chatting” with an image model, you are basically accepting that a response could cost you $1.00+ whenever the AI invokes this tool, because how the billable input context is collected is obfuscated.

The API reference documentation page is currently a formatting mess. Here are the parameters that can be passed in “tools” when you include the image_generation tool, listed alphabetically. Unfortunately the table is wide and this forum can’t display it wide. The same schema is also reused for the tool configuration echoed back in the response output, in case some parameters don’t look like inputs.

Image Generation Tool

The image generation tool creates new images or edits existing images using GPT image models.

Use this tool by including a configuration with:

{
  "type": "image_generation"
}

Additional parameters may be provided to control the model, image size, quality, background, output format, editing behavior, and streaming previews.

Basic Example

{
  "type": "image_generation",
  "model": "gpt-image-1.5",
  "action": "generate",
  "size": "1024x1024",
  "quality": "high",
  "background": "auto",
  "output_format": "png"
}

Parameter Reference

Parameter	Required	Accepted Values	Default	Description
`type`	Yes	`"image_generation"`	—	Identifies this as the image generation tool. This value must always be `"image_generation"`.
`action`	No	`"generate"`, `"edit"`, `"auto"`	`"auto"`	Controls whether the tool should create a new image, edit an existing image, or automatically choose the appropriate behavior.
`background`	No	`"transparent"`, `"opaque"`, `"auto"`	`"auto"`	Controls whether the generated image should have a transparent background, an opaque background, or automatic background handling.
`input_fidelity`	No	`"high"`, `"low"`	`"low"`	Controls how closely the output should preserve style, identity, and visual details from input images. Especially relevant for facial features and image edits.
`input_image_mask`	No	See Input Image Mask	—	Provides a mask image for inpainting or targeted image editing.
`model`	No	`"gpt-image-1"`, `"gpt-image-1-mini"`, `"gpt-image-1.5"`, or another supported model name as a string	`"gpt-image-1"`	Selects the image generation model.
`moderation`	No	`"auto"`, `"low"`	`"auto"`	Controls the moderation strictness applied to generated images.
`output_compression`	No	Number	`100`	Controls output image compression. Mainly relevant for compressed formats such as JPEG or WebP.
`output_format`	No	`"png"`, `"webp"`, `"jpeg"`	`"png"`	Sets the file format for the generated image.
`partial_images`	No	Number from `0` to `3`	`0`	Controls how many partial image previews are produced while streaming. Use `0` to disable partial images.
`quality`	No	`"low"`, `"medium"`, `"high"`, `"auto"`	`"auto"`	Controls the quality level of the generated image. Higher quality may increase generation time or cost.
`size`	No	`"1024x1024"`, `"1024x1536"`, `"1536x1024"`, `"auto"`	`"auto"`	Sets the output image dimensions. Use `"auto"` to let the system choose.
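The table above can be turned into a small pre-flight check. This is a sketch only: the accepted values are copied from the table as posted here, the function name `validate_tool` is hypothetical, and the live API may accept values the table does not list (for example other model names or sizes).

```python
from typing import List

# Accepted values copied from the parameter table above.
ENUMS = {
    "action": {"generate", "edit", "auto"},
    "background": {"transparent", "opaque", "auto"},
    "input_fidelity": {"high", "low"},
    "moderation": {"auto", "low"},
    "output_format": {"png", "webp", "jpeg"},
    "quality": {"low", "medium", "high", "auto"},
    "size": {"1024x1024", "1024x1536", "1536x1024", "auto"},
}


def validate_tool(config: dict) -> List[str]:
    """Return a list of problems found in an image_generation tool config."""
    problems = []
    if config.get("type") != "image_generation":
        problems.append('type must be "image_generation"')
    for key, allowed in ENUMS.items():
        if key in config and config[key] not in allowed:
            problems.append(f"{key}={config[key]!r} is not one of {sorted(allowed)}")
    partial = config.get("partial_images", 0)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(partial, bool) or not isinstance(partial, int) or not 0 <= partial <= 3:
        problems.append("partial_images must be an integer from 0 to 3")
    return problems


if __name__ == "__main__":
    print(validate_tool({"type": "image_generation", "size": "512x512"}))
```

The `model` parameter is deliberately left unchecked, since the table allows any other supported model name as a string.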

Parameters in Detail

`type`

Identifies the tool configuration as an image generation request.

Required value:

"type": "image_generation"

This parameter is required.

`action`

Controls whether the tool generates a new image, edits an existing image, or decides automatically.

Accepted values:

Value	Meaning
`"generate"`	Create a new image from the prompt or instructions.
`"edit"`	Modify an existing input image.
`"auto"`	Let the system choose between generation and editing behavior.

Default:

"auto"

`background`

Controls the background style of the generated image.

Accepted values:

Value	Meaning
`"transparent"`	Generate an image with transparency where supported.
`"opaque"`	Generate an image with a solid, non-transparent background.
`"auto"`	Let the system choose the appropriate background handling.

Default:

"auto"

`input_fidelity`

Controls how closely the output should preserve details from supplied input images.

Accepted values:

Value	Meaning
`"high"`	Stronger preservation of input image details, style, and features. Useful when editing faces, likenesses, or specific visual identities.
`"low"`	Looser preservation of input details. Allows more variation from the input image.

Default:

"low"

Model support:

Supported by gpt-image-1
Supported by gpt-image-1.5
Not supported by gpt-image-1-mini

`input_image_mask`

Provides a mask image for inpainting or targeted editing.

The mask can be supplied either as a file ID or as a base64-encoded image string.

Example using a file ID:

{
  "input_image_mask": {
    "file_id": "file_abc123"
  }
}

Example using a base64-encoded image:

{
  "input_image_mask": {
    "image_url": "data:image/png;base64,..."
  }
}

Subfields:

Field	Required	Accepted Values	Description
`file_id`	No	String	ID of a previously uploaded mask image file.
`image_url`	No	String	Base64-encoded mask image.

Notes:

Use input_image_mask when only part of an image should be edited.
The mask identifies the area to modify during inpainting.
At least one of file_id or image_url should be provided when using a mask.
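Both mask forms can be produced with a couple of small helpers. This is a sketch; the function names are hypothetical, and it assumes the mask is a PNG whose bytes you already have in memory.

```python
import base64


def mask_from_png_bytes(png_bytes: bytes) -> dict:
    """Build an input_image_mask entry carrying the mask as a base64 data URL."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return {"input_image_mask": {"image_url": f"data:image/png;base64,{encoded}"}}


def mask_from_file_id(file_id: str) -> dict:
    """Build an input_image_mask entry that points at an uploaded file."""
    return {"input_image_mask": {"file_id": file_id}}


if __name__ == "__main__":
    # Only the 8-byte PNG signature is used here, to keep the example short;
    # a real mask would be the full contents of a PNG file.
    signature = b"\x89PNG\r\n\x1a\n"
    tool = {"type": "image_generation", "action": "edit"}
    tool.update(mask_from_png_bytes(signature))
    print(tool["input_image_mask"]["image_url"])
```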

`model`

Selects the image generation model.

Accepted values include:

Value	Description
`"gpt-image-1"`	Default image generation model.
`"gpt-image-1-mini"`	Smaller image generation model. Some advanced features may not be supported.
`"gpt-image-1.5"`	Newer image generation model with support for advanced image features.
Any other supported model name as a string	Allows specifying another compatible image model if available.

Default:

"gpt-image-1"

`moderation`

Controls the moderation level used for image generation.

Accepted values:

Value	Meaning
`"auto"`	Use the default moderation behavior.
`"low"`	Use a lower moderation level where available.

Default:

"auto"

`output_compression`

Controls the compression level of the output image.

Accepted value:

Number

Default:

100

Notes:

This is most relevant for compressed output formats such as "jpeg" and "webp".
Higher values generally mean less compression and higher image quality.
Lower values generally mean more compression and smaller file size.

`output_format`

Sets the output image file format.

Accepted values:

Value	Meaning
`"png"`	PNG image output. Useful for lossless images and transparency.
`"webp"`	WebP image output. Useful for compressed web images.
`"jpeg"`	JPEG image output. Useful for photographs and compressed images without transparency.

Default:

"png"

`partial_images`

Controls how many partial images are generated while streaming.

Accepted value:

0, 1, 2, or 3

Default:

0

Meaning:

Value	Meaning
`0`	Do not generate partial image previews.
`1`	Generate one partial image preview.
`2`	Generate two partial image previews.
`3`	Generate three partial image previews.

`quality`

Controls the image quality level.

Accepted values:

Value	Meaning
`"low"`	Lower quality, generally faster or less expensive.
`"medium"`	Balanced quality.
`"high"`	Higher quality, generally slower or more expensive.
`"auto"`	Let the system choose the appropriate quality level.

Default:

"auto"

`size`

Sets the output image dimensions.

Accepted values:

Value	Orientation
`"1024x1024"`	Square
`"1024x1536"`	Portrait
`"1536x1024"`	Landscape
`"auto"`	Automatically selected

Default:

"auto"

Compact Example: Generate a Square PNG

{
  "type": "image_generation",
  "action": "generate",
  "model": "gpt-image-1",
  "size": "1024x1024",
  "quality": "auto",
  "output_format": "png"
}

Compact Example: Edit an Image With High Input Fidelity

{
  "type": "image_generation",
  "action": "edit",
  "model": "gpt-image-1.5",
  "input_fidelity": "high",
  "size": "1024x1024",
  "quality": "high",
  "output_format": "png"
}

Compact Example: Use a Mask for Inpainting

{
  "type": "image_generation",
  "action": "edit",
  "model": "gpt-image-1.5",
  "input_image_mask": {
    "file_id": "file_abc123"
  },
  "size": "1024x1024",
  "quality": "high",
  "output_format": "png"
}

pesco · May 6, 2026, 7:18am

Hi _j,

Thank you so much for the incredibly detailed response. It completely clarifies the situation, even if it confirms my worst fears.

The fact that the Responses API silently passes the entire chat context to the image model, and bills for those vision/input tokens without explicitly reporting them anywhere in the usage object, is a massive dealbreaker.

Do you (or anyone else closely following the developer updates) foresee OpenAI updating this? Is there any buzz or roadmap indicating that they will eventually expose the exact, itemized tool invocation costs directly in the API response payload?

Thanks again.

_j · May 6, 2026, 7:38am

The main thing here: OpenAI gives a “cached” price for the image model, yet never delivers a cached discount on the generate or edits images API.

It is just as likely that they never give a discount when images are generated by the Responses API tool while you “chat with an image buddy”, which was your reason for exploring Responses. The auditing available is so poor, and the obfuscation so apparent, that you would have to set up a separate project and make a deterministic sequence of calls just to work out the costs well enough to report an error.

The logical place to return the usage would be in the internal tool call event you have returned as output, just as you can get the code the AI wrote. If streaming and live, you can monitor that tool call result and say, “that’s enough money spent on images for you”.
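Since the tool call result carries no cost, any live cutoff has to rely on your own estimate. Below is a minimal sketch of such a guard, assuming streamed events have been converted to plain dicts and that a finished image shows up as a `response.output_item.done` event whose item has type `image_generation_call`. The flat per-image cost is a hypothetical placeholder, not a real price, and `guard_image_spend` is an invented name.

```python
from decimal import Decimal
from typing import Iterable, Iterator

# Hypothetical flat estimate per generated image; not a real price.
ASSUMED_COST_PER_IMAGE = Decimal("0.25")


def guard_image_spend(events: Iterable[dict], budget: Decimal) -> Iterator[dict]:
    """Yield events until the estimated image spend reaches the budget."""
    spent = Decimal("0")
    for event in events:
        yield event
        item = event.get("item") or {}
        finished_image = (
            event.get("type") == "response.output_item.done"
            and item.get("type") == "image_generation_call"
        )
        if finished_image:
            spent += ASSUMED_COST_PER_IMAGE
            if spent >= budget:
                # Stop consuming the stream; the caller should also cancel
                # or close the underlying request.
                return


if __name__ == "__main__":
    image_done = {
        "type": "response.output_item.done",
        "item": {"type": "image_generation_call"},
    }
    stream = [{"type": "response.output_text.delta"}, image_done, image_done, image_done]
    for event in guard_image_spend(stream, Decimal("0.50")):
        print(event["type"])
```

This only limits how many images a user can trigger; it does not recover the true cost, which still has to come from the usage dashboard or a controlled test project.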

I don’t foresee any change. There is no “bug report” channel being listened to here, where you can say, “we must have costs in order to even consider this product, so we can bill, for what you give to free ChatGPT users”.

No tool tells you how much it is costing you: whether you were charged for more “code” in 20-minute increments for containers, for vector store token placement individually, and now for vector store search fees, internet search fees, etc. It is hard to believe this opacity is just a prolonged oversight. They did the same on Assistants when its retrieval costs were astronomical, even degrading the usage page on the same day of release.

The one thing you will note, at least, is that you have some semblance of control over what the image output might be and which model is used, through the parameters you specify. However, that doesn’t help when you want logical constraints with reasonable variation (such as a varying aspect ratio) for arbitrary “chat” images produced by talking to the AI. You’d need a bunch of user interface controls, or to allow just one image type only. Your imagination is limited to their imagination.

