Introducing gpt-image-2 - available today in the API and Codex

gpt-image-2 is OpenAI’s most capable image generation model yet. It is designed for complex visual tasks and produces precise, usable images with stronger editing, better layouts, improved text rendering, and more reliable instruction-following.

Starting today, developers can use gpt-image-2 in the API and in Codex.

This release is built for production workflows, where images need to be accurate, readable, on-brand, localized, formatted for the destination surface, and usable without heavy cleanup.

Independent evaluations already show strong results

Just hours after launch, gpt-image-2 reached the #1 spot across all Image Arena leaderboards, including an unprecedented +242 point lead in Text-to-Image.

What’s new

  • Create assets for any surface
    More aspect ratios and resolutions up to 2K for apps, ads, product flows, social, presentations, and docs.

  • More practical text-heavy visuals
    Stronger structured generation (diagrams, infographics, charts, posters, comics) and improved multilingual text rendering.

  • Better control from prompt to final asset
    More reliable instruction-following, detail preservation, and composition—resulting in more usable outputs.

  • Thinking mode for richer workflows
    When paired with reasoning models, gpt-image-2 can research, transform inputs, generate variations, and self-check to produce context-aware assets.
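The "any surface" bullet above can be sketched as a small request builder. The size strings and surface names here are assumptions based on the announced "up to 2K" support, not a confirmed list; check the model page for the authoritative values.

```python
# Hypothetical mapping from target surface to a gpt-image-2 request.
# The size strings are assumptions, not the documented set.
SURFACE_SIZES = {
    "square": "1024x1024",
    "landscape_2k": "2048x1152",
    "portrait_2k": "1152x2048",
}

def image_request(prompt: str, surface: str) -> dict:
    """Build kwargs for client.images.generate(**image_request(...))."""
    return {
        "model": "gpt-image-2",
        "prompt": prompt,
        "size": SURFACE_SIZES[surface],
    }
```

With an OpenAI client this would be used as `client.images.generate(**image_request("launch banner", "landscape_2k"))`.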

Model capability gallery

The official gpt-image-2 release blog was illustrated entirely with images generated by the model itself.

Below is a selection of images from that post, each highlighting a different capability:

Start creating

Pricing (per 1M tokens)

Modality   Input    Cached Input   Output
Image      $8.00    $2.00          $30.00
Text       $5.00    $1.25          $10.00

Full details and rate limits are available on the model page.
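The table prices per million tokens, so a per-request estimate is simple arithmetic. A minimal sketch; the token counts in the example are hypothetical placeholders, since actual counts depend on resolution and prompt length:

```python
# Prices from the table above, in USD per 1M tokens.
PRICE_PER_M = {
    ("image", "input"): 8.00,
    ("image", "cached_input"): 2.00,
    ("image", "output"): 30.00,
    ("text", "input"): 5.00,
    ("text", "cached_input"): 1.25,
    ("text", "output"): 10.00,
}

def cost_usd(modality: str, kind: str, tokens: int) -> float:
    """Price a token count against the per-1M-token table."""
    return PRICE_PER_M[(modality, kind)] * tokens / 1_000_000

# Hypothetical request: 100 text input tokens plus a 4,000-token image output.
total = cost_usd("text", "input", 100) + cost_usd("image", "output", 4000)
```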

Use gpt-image-2 in the API for production image generation workflows, or in Codex when you want to create visual assets directly from what you are already building.

We’re excited to see what you build. Share what you create in the forum.

Also worth reading:

Apr 21, 2026 - GPT Image (2) Generation Models Prompting Guide

I just added this to term-llm

Very cool to be able to just generate images from the API with my plan. I wish the API endpoint were directly available, but going through the model seems to work fine.

Sam: OpenAI updated the image generation docs, and if you toggle the tab-like header button “Image API” above the code example, you see the “2”. Does this, or the curl example as multipart/form-data, not work for you?

from openai import OpenAI

client = OpenAI()

# Example prompt; substitute your own.
prompt = "Combine these products into a single gift-basket photo."

# Edit with multiple reference images as input
result = client.images.edit(
    model="gpt-image-2",
    image=[
        open("body-lotion.png", "rb"),
        open("bath-bomb.png", "rb"),
        open("incense-kit.png", "rb"),
        open("soap.png", "rb"),
    ],
    prompt=prompt,
)
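The edit call returns the image as base64 rather than a file. A minimal helper to persist it, assuming the response carries `data[0].b64_json` as the Image API does for gpt-image models; adjust if you requested a URL format instead:

```python
import base64

def save_first_image(result, path: str) -> None:
    """Decode the base64 payload of the first image in an
    images.edit/images.generate response and write it to disk.
    Assumes the response exposes data[0].b64_json."""
    image_bytes = base64.b64decode(result.data[0].b64_json)
    with open(path, "wb") as f:
        f.write(image_bytes)
```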

Currently too disinterested to check it out or code up an arbitrary resolution picker…

Oh I am sure the API works fine; the trouble is that on the ChatGPT Codex plan all we get is:

backend-api/codex/responses. The other endpoints don’t work with the OAuth key.

Anyone know when it will be available on the enterprise tier?

It is coming to Enterprise and Edu soon.

Translated with AI:


Dear Team,

I would like to congratulate your team on the recent improvements in image processing and generation. The progress of this functionality has been remarkably meaningful and opens up a wide range of practical possibilities.

Building on this advancement, I would like to suggest a potential expansion: the ability to work with longer-form videos derived from multiple sequential images.

For example, it would be extraordinarily compelling to allow users to submit a large sequence of images—such as comic book or manga pages containing more than 100 panels—and enable the AI to:

  • Interpret speech bubbles in order to understand the context and construct a coherent narrative script
  • Identify characters, expressions, and settings from the images
  • Generate motion between panels, transforming a static sequence into a fluid animation
  • Produce something akin to an animated adaptation in the style of anime production

Such a feature could represent a major breakthrough for content creators, artists, and even media companies, significantly reducing the effort required to transform visual stories into fully realized animations.

I believe that, given the capabilities already in place, this would be a natural and highly impactful next step.

Thank you for your attention and for the excellent work that has been carried out.



Your ideas are outstanding. However, generating motion (e.g. fluid animation) would no doubt require massive amounts of expensive compute, which is why Sora was scuttled.

Is there a way to get higher rate limits than what’s listed on the rate limit page? I have 5k RPM with Gemini nano banana 2, but the highest rate limit for gpt-image is 250 IPM, a 20x difference. Thanks!

You make a good point. Since the R in RPM can mean the difference between 500 and 20,000 tokens per request, and image models are now purely multimodal language models (with DALL-E being shut off), the rate limit for the image API should be more intelligent, with the per-image count limitation simply removed.
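Until server-side limits change, a client-side throttle can at least keep a batch job under the published image limit. A minimal sketch, using the 250 IPM figure from the discussion above (not a guaranteed value):

```python
import threading
import time

class RateLimiter:
    """Allow at most `per_minute` calls per rolling minute by spacing
    calls evenly. Call wait() before each image request."""

    def __init__(self, per_minute: int):
        self.interval = 60.0 / per_minute
        self._lock = threading.Lock()
        self._next = 0.0  # monotonic time the next call may start

    def wait(self) -> None:
        with self._lock:
            now = time.monotonic()
            delay = max(0.0, self._next - now)
            self._next = max(now, self._next) + self.interval
        if delay:
            time.sleep(delay)

# Example: limiter = RateLimiter(250); limiter.wait() before each edit call.
```

The lock makes it safe to share one limiter across worker threads, which is the usual shape of a batch image pipeline.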

Is the thinking mode the same as making it an image tool for a reasoning model?

That must be what the mention of “thinking mode” refers to, but I think it specifically means ChatGPT when you use “thinking” instead of “instant”.

ChatGPT has web search that can return images, but I don’t think that is part of the web search tool’s return context in the API; not that OpenAI provides any transparency about what the web tool returns (which you are getting loaded as tokens you cannot audit). You’d have to build your own tool that can now provide images as a tool response, which is only supported on the Responses API.