How is pricing calculated when using /v1/responses with gpt-image-1?

Hi, when I use the /v1/responses endpoint with the gpt-image-1 tool, how can I calculate the exact cost of my request?

To my mind there should be both a TEXT cost and an IMAGE cost, but the response JSON only reports usage as input and output tokens, with no mention of the image.

Here is an example.

Request JSON
{
    "model": "gpt-4.1-mini",
    "input": "Generate an image of gray tabby cat hugging an otter with an orange scarf",
    "tools": [
        {
            "type": "image_generation"
        }
    ]
}
Response JSON
{
  "id": "resp_0180d3b13d71b7a60068c5895175fc819682fdc45be47e8c3a",
  "object": "response",
  "created_at": 1757776209,
  "status": "completed",
  "background": false,
  "error": null,
  "incomplete_details": null,
  "instructions": null,
  "max_output_tokens": null,
  "max_tool_calls": null,
  "model": "gpt-4.1-mini-2025-04-14",
  "output": [
    {
      "id": "ig_0180d3b13d71b7a60068c589526ff88196987aebdd12e1230d",
      "type": "image_generation_call",
      "status": "completed",
      "background": "opaque",
      "output_format": "png",
      "quality": "high",
      "result": "[base64-encoded image data]",
      "revised_prompt": "A gray tabby cat hugging an otter. The otter is wearing a bright orange scarf. The scene is cute and heartwarming, with both animals showing a friendly and affectionate gesture. The background is simple and soft to highlight the animals.",
      "size": "1024x1024"
    },
    {
      "id": "msg_0180d3b13d71b7a60068c5897bdc9c81968b01b524128bfab0",
      "type": "message",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "annotations": [],
          "logprobs": [],
          "text": "Here is an image of a gray tabby cat hugging an otter wearing an orange scarf. If you need any changes or another image, feel free to ask!"
        }
      ],
      "role": "assistant"
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": null,
  "prompt_cache_key": null,
  "reasoning": {
    "effort": null,
    "summary": null
  },
  "safety_identifier": null,
  "service_tier": "default",
  "store": true,
  "temperature": 1.0,
  "text": {
    "format": {
      "type": "text"
    },
    "verbosity": "medium"
  },
  "tool_choice": "auto",
  "tools": [
    {
      "type": "image_generation",
      "background": "auto",
      "moderation": "auto",
      "n": 1,
      "output_compression": 100,
      "output_format": "png",
      "quality": "auto",
      "size": "auto"
    }
  ],
  "top_logprobs": 0,
  "top_p": 1.0,
  "truncation": "disabled",
  "usage": {
    "input_tokens": 2285,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 96,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 2381
  },
  "user": null,
  "metadata": {}
}

As we can see, usage.input_tokens is 2285 and output_tokens is 96. An image was generated by gpt-image-1, but there is no information about its cost. How can I calculate it?

I would appreciate any reply. Best regards!

You cannot. The AI may do any number of things based on an input, and the context of images sent to the image model is undocumented.

“I’m sorry, but I can’t make that image” is relatively cheap.
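
If an estimate is good enough, though, the tool call does report size and quality, and OpenAI's pricing documentation lists a fixed output-token count per size/quality combination. A rough sketch of that mapping — the token figures below are assumptions copied from the pricing table at time of writing, so verify them against the current page before relying on them:

```python
# Rough estimate of the image half of a Responses API call.
# The per-image output token counts below are ASSUMED from OpenAI's
# gpt-image-1 pricing table as of writing -- verify before billing.
IMAGE_OUTPUT_TOKENS = {
    # (size, quality): approximate output image tokens
    ("1024x1024", "low"): 272,
    ("1024x1024", "medium"): 1056,
    ("1024x1024", "high"): 4160,
    ("1024x1536", "low"): 408,
    ("1024x1536", "medium"): 1584,
    ("1024x1536", "high"): 6240,
    ("1536x1024", "low"): 400,
    ("1536x1024", "medium"): 1568,
    ("1536x1024", "high"): 6208,
}

def estimate_image_output_tokens(size: str, quality: str) -> int:
    """Estimate output image tokens from image_generation_call fields."""
    return IMAGE_OUTPUT_TOKENS[(size, quality)]

# The sample response above reported size "1024x1024", quality "high":
print(estimate_image_output_tokens("1024x1024", "high"))  # prints 4160
```

This still misses the undocumented input context sent to the image model, so treat it as a floor, not an exact bill.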


Wouldn’t it be simple for the output image token count to just be added to the image_generation_call tool output? That seems like it would be the most straightforward solution.

Great idea!

For example, were you to make an API call to one of the new models via the images endpoint:

{'model': 'chatgpt-image-latest', 'prompt': 'If tuna fish could talk', 'size': '1024x1024', 'timeout': 240, 'user': 'image-editor-user', 'output_format': 'png', 'quality': 'medium', 'background': 'opaque'}

Then, in the JSON that contains the b64 data of your image, you would also have “usage”:

{'input_tokens': 11, 'input_tokens_details': {'image_tokens': 0, 'text_tokens': 11}, 'output_tokens': 1470, 'total_tokens': 1481, 'output_tokens_details': {'image_tokens': 1056, 'text_tokens': 414}}
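
Given a usage object with that breakdown, cost is just token counts times per-token rates. A minimal sketch — the rates below are placeholder assumptions standing in for the per-million-token prices on OpenAI's pricing page, and output text tokens are deliberately left out since their billing for this endpoint is not shown in the thread:

```python
# Sketch: turn an images-endpoint "usage" object into a dollar figure.
# RATES_PER_MILLION values are hypothetical placeholders -- substitute
# the current per-million-token prices from OpenAI's pricing page.
RATES_PER_MILLION = {
    "text_input": 5.00,    # assumed USD per 1M text input tokens
    "image_input": 10.00,  # assumed USD per 1M image input tokens
    "image_output": 40.00, # assumed USD per 1M image output tokens
}

def image_call_cost(usage: dict) -> float:
    """Price the token categories exposed in the usage details."""
    inp = usage["input_tokens_details"]
    out = usage["output_tokens_details"]
    tokens = {
        "text_input": inp["text_tokens"],
        "image_input": inp["image_tokens"],
        "image_output": out["image_tokens"],
        # output text_tokens omitted: pricing not shown in this thread
    }
    return sum(tokens[k] * RATES_PER_MILLION[k] / 1_000_000 for k in tokens)

usage = {
    "input_tokens": 11,
    "input_tokens_details": {"image_tokens": 0, "text_tokens": 11},
    "output_tokens": 1470,
    "output_tokens_details": {"image_tokens": 1056, "text_tokens": 414},
    "total_tokens": 1481,
}
print(f"${image_call_cost(usage):.6f}")  # → $0.042295
```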

See the API reference image generation response object and see if that meets your needs: https://platform.openai.com/docs/api-reference/images/object

There is nothing to observe if you use OpenAI’s hosted Responses API tools: no token report in the response.image_generation_call.completed event, and no “usage” beyond the chat model’s own context. But nothing stops you from making your own function for image generation, which also lets you control what goes “in”, including the input image costs taken from the entire conversation’s “vision”. You might then even have a bit of control over users prompting and jailbreaking your app into making a dozen images.
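
A sketch of what such a do-it-yourself function tool could look like — the tool name generate_image and its schema are hypothetical, while the SDK calls are from the official openai Python package, whose Images API response does carry a usage object:

```python
# Sketch: skip the hosted image_generation tool and expose your own
# function tool that calls the Images API directly, so per-call
# "usage" stays visible. "generate_image" and the schema below are
# hypothetical names for illustration.
import base64

# Function-tool schema to pass to /v1/responses instead of
# {"type": "image_generation"}.
IMAGE_TOOL = {
    "type": "function",
    "name": "generate_image",
    "description": "Create an image from a text prompt.",
    "parameters": {
        "type": "object",
        "properties": {
            "prompt": {"type": "string"},
            "size": {"type": "string",
                     "enum": ["1024x1024", "1024x1536", "1536x1024"]},
            "quality": {"type": "string",
                        "enum": ["low", "medium", "high"]},
        },
        "required": ["prompt"],
    },
}

def generate_image(prompt: str, size: str = "1024x1024",
                   quality: str = "medium"):
    """Handle the tool call yourself so per-image usage is observable."""
    from openai import OpenAI  # lazy import; requires the openai SDK
    client = OpenAI()
    resp = client.images.generate(
        model="gpt-image-1", prompt=prompt, size=size, quality=quality,
    )
    image_bytes = base64.b64decode(resp.data[0].b64_json)
    return image_bytes, resp.usage  # usage has image/text token splits
```

You would then feed the returned image back into the conversation yourself and log resp.usage on every call.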

@_j for proxies like LiteLLM, it would just be great if this info could be added to the managed tool output. For example, Google’s Nano Banana models for multimodal output do include the output image token count, so the proxy (or whatever downstream application) can monitor costs per API request.

Having to build a separate custom function / tool defeats the purpose of the OpenAI / Azure OpenAI provided tools and makes multimodal output much more challenging.


And therein lies the difference: OpenAI does not let you “chat” with an AI model that can natively generate images. On the images endpoint, your “prompt” text is likely still containerized in a task-based message telling the AI what to produce, on top of image-only fine-tuning.

An image tool report would need a more encompassing internal collector of costs, perhaps even a “tool usage” object not yet envisioned (which could also report directly on file search fees, automatic code containers, etc.).

@_j I'm not sure what you’re arguing here. The underlying “tool” is just calling gpt-image-*, which does enumerate and count image tokens in and out, so it could quite easily return usage along with the tool output. It already returns plenty of other tool-specific information, such as "background": "opaque" and "output_format": "png"; why not usage as well? It seems like a simple API output update for OpenAI to make.
