Gpt-image-1 collected pricing information - and why Responses is undocumented

A 4k image would cost more, not because it is 3840x2160, but rather, because it is rectangular.

  • A perfect square of any large size will be downsized on gpt-image-1 to a 512x512 tile.
  • A rectangle that is under twice as tall or twice as wide would consume two tiles, with the shorter dimension being downsized to 512px.

Thus, whether HD at 720p, 1080p or 4k, the input would result in a “512p” video resolution, 910x512 - two tiles of input image billing. HD video is one of the examples shown in section 4 with its cost, as “selfie” (because portable devices also often use this aspect ratio). Three input images would cost you just a penny.


The pricing is significantly amplified now if you use the new input fidelity option: that same image would have an additional $0.062 cost - per input image to the edits API, and per-image for images in past chat also when using Responses and gpt-image-1 as a tool.

1 Like