GPT-5.4 Pro and Thinking are here!

Today, we’re releasing GPT‑5.4 in ChatGPT (as GPT‑5.4 Thinking), the API, and Codex. It’s our most capable and efficient frontier model for professional work. We’re also releasing GPT‑5.4 Pro in ChatGPT and the API, for people who want maximum performance on complex tasks.

  • Native computer-use capabilities.
  • Up to 1M tokens of context in Codex and the API.
  • Best-in-class agentic coding for complex tasks.
  • Scalable tool search across larger ecosystems.
  • More efficient reasoning for long, tool-heavy workflows.

GPT-5.4 sets new state-of-the-art results on professional work and computer use, with strong gains in coding and tool use.

  • 83.0% on GDPval
  • 75.0% on OSWorld-Verified
  • 57.7% on SWE-Bench Pro (Public)
  • 54.6% on Toolathlon

GPT-5.4 can write Playwright code, read screenshots, and issue keyboard/mouse actions to operate computers. You can steer its behavior and set custom confirmation policies for different risk tolerances. On OSWorld-Verified, it achieves a state-of-the-art 75.0% success rate.
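The custom confirmation policies mentioned above could be modeled roughly like the sketch below. Everything here is hypothetical illustration, not the actual API surface: the action names, risk levels, and `needs_confirmation` helper are all assumptions about how one might map computer-use actions to risk tolerances.

```python
from enum import Enum

class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3

# Hypothetical mapping of computer-use actions to risk levels.
ACTION_RISK = {
    "screenshot": Risk.LOW,
    "click": Risk.MEDIUM,
    "type_text": Risk.MEDIUM,
    "run_shell": Risk.HIGH,
    "delete_file": Risk.HIGH,
}

def needs_confirmation(action: str, tolerance: Risk) -> bool:
    """Pause for user confirmation when an action's risk exceeds the tolerance."""
    risk = ACTION_RISK.get(action, Risk.HIGH)  # unknown actions treated as high risk
    return risk.value > tolerance.value

# A medium-tolerance policy confirms shell commands but not screenshots.
print(needs_confirmation("run_shell", Risk.MEDIUM))   # True
print(needs_confirmation("screenshot", Risk.MEDIUM))  # False
```

The point of the pattern: the agent loop checks each proposed keyboard/mouse/shell action against the policy before executing it, so riskier environments can simply pass a lower tolerance.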

GPT-5.4 is stronger on coding, especially on longer-running tasks where it can use tools and iterate.
On SWE-Bench Pro, it outperforms GPT-5.3-Codex, while being lower latency across reasoning efforts. In Codex, /fast mode delivers up to 1.5x faster performance across supported models, including GPT-5.4.

GPT-5.4 is designed for complex professional work.
On GDPval, it matches or exceeds industry professionals in 83.0% of comparisons, up from 70.9% for GPT-5.2. On our internal spreadsheet modeling benchmark, it scores 87.3% vs 68.4% for GPT-5.2.

GPT-5.4 makes tool use more capable and efficient.
In the API, tool search lets agents retrieve only the definitions they need, reducing token usage and preserving the cache. Across 250 MCP Atlas tasks, tool search cut token usage by 47% at the same accuracy.
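As a back-of-the-envelope illustration of why retrieving only the needed definitions saves tokens (the tool counts and per-definition sizes below are made-up illustrative numbers, not measured values from MCP Atlas):

```python
def prompt_tokens(num_tools: int, tokens_per_def: int, task_tokens: int) -> int:
    """Total prompt tokens when tool definitions are sent up front."""
    return num_tools * tokens_per_def + task_tokens

# Illustrative: a 200-tool ecosystem at ~150 tokens per definition.
full = prompt_tokens(200, 150, 2_000)      # every definition in context
searched = prompt_tokens(5, 150, 2_000)    # only the 5 tools the task needs
savings = 1 - searched / full
print(f"{savings:.0%} fewer prompt tokens")
```

Because the unused definitions never enter the prompt, the stable prefix (instructions plus the few retrieved tools) also stays cache-friendly across turns.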

GPT-5.4 is rolling out gradually today across ChatGPT and Codex.
In the API, GPT-5.4 is available now as gpt-5.4, and GPT-5.4 Pro is available as gpt-5.4-pro.

Best practices for working with GPT-5.4

Migrate to GPT-5.4 with this Codex skill:

@sps created a companion post for this release:

13 Likes

https://openai.com/index/introducing-gpt-5-4/

https://openai.com/index/gpt-5-4-thinking-system-card/

GPT-5.4 Model | OpenAI API

4 Likes


SILENTLY DROPPED

LETS GO BOIS LETS GOOOOOOOOOOO, MORE TIME WITHOUT TOUCHING GRASS!

OpenAI, give us a Codex mobile app and I can die in peace.

Also, thanks OpenAI, you guys have been killing it. DOW issues have been causing problems, but for me you're literally giving me superpowers since I grabbed ChatGPT and made it my mentor, so personally I can't be mad at you guys.

Can’t stress it enough, thank you guys… God bless.

5 Likes

@polepole @af0r

Thanks for the announcement post guys! :heart:

4 Likes

Nice. Looks like a mild upgrade to coding, but I’ll take it! Restart your CLIs! :racing_car:

4 Likes

Monthly releases are here. It is somewhat refreshing not having to wait an indefinite amount of time for the next model to be released.

5 Likes

Yes, yes yes yes yes yes yes yes YEEEESSSSSS!!!

And ESPECIALLY a solution to remote codex cli … with dictation …

2 Likes

Lying down in a park, bare feet on the grass, coding from my phone with Codex: that’s the one thing missing!

I know they’re cooking it, though…

4 Likes

So are we going to see a “codex” variant of this model? I’m assuming we won’t based on the presentation of its coding capability …

3 Likes

The tool_search looks very interesting:

3 Likes

The missing rows of “pricing”:

| Model                              | Input  | Cached input | Output  |
| ---------------------------------- | ------ | ------------ | ------- |
| gpt-5.4 (<272k context length)     | $2.50  | $0.25        | $15.00  |
| gpt-5.4-pro (<272k context length) | $30.00 | n/a          | $180.00 |
| gpt-5.4 (>272k context length)     | $5.00  | $0.50        | $22.50  |
| gpt-5.4-pro (>272k context length) | $60.00 | n/a          | $270.00 |

Large context is not a progressive tax: it is a switch to the higher price once input exceeds 272,000 tokens (272000, not 1024 × 272), and the entire request is then billed at the higher rate.
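Using the uncached input prices from the table above, a minimal sketch of that tier switch for gpt-5.4 (input side only; the same pattern applies to output):

```python
def gpt54_input_cost(input_tokens: int) -> float:
    """Input cost in USD for gpt-5.4 at the per-1M rates listed above.
    The whole request flips to the higher rate once input exceeds
    272,000 tokens; the first 272k is NOT billed separately at the low rate."""
    rate_per_million = 2.50 if input_tokens <= 272_000 else 5.00
    return input_tokens * rate_per_million / 1_000_000

print(gpt54_input_cost(272_000))  # 0.68: everything at the base rate
print(gpt54_input_cost(272_001))  # ~1.36: one extra token doubles the rate on all tokens
```

That one-token jump from $0.68 to roughly $1.36 is exactly the "switch, not progressive tax" behavior described above.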

Also review vision: the default image-resolution cap is higher, up from 1536 to 2500 “patch” tokens, equivalent to roughly a 1600x1600 px area, so you may especially want to get resizing under your own control before sending images to the API.
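A sketch of clamping resolution yourself before upload. The 32 px patch edge is my inference from 2500 patches ≈ a 1600x1600 px area (1600/32 = 50, and 50 × 50 = 2500); treat both constants as assumptions, not documented values.

```python
import math

PATCH_PX = 32       # assumed patch edge, inferred from 2500 ~= (1600/32)**2
MAX_PATCHES = 2500  # cap mentioned above for the default detail level

def patches(width: int, height: int) -> int:
    """Patch count for an image tiled into PATCH_PX x PATCH_PX cells."""
    return math.ceil(width / PATCH_PX) * math.ceil(height / PATCH_PX)

def fit_under_cap(width: int, height: int) -> tuple[int, int]:
    """Scale dimensions down (roughly preserving aspect ratio) to fit the cap."""
    if patches(width, height) <= MAX_PATCHES:
        return width, height
    scale = math.sqrt(MAX_PATCHES * PATCH_PX**2 / (width * height))
    w, h = int(width * scale), int(height * scale)
    # Rounding up to whole patches can still overshoot slightly; nudge down.
    while patches(w, h) > MAX_PATCHES:
        w, h = w - PATCH_PX, h - PATCH_PX
    return w, h

print(patches(1600, 1600))        # 2500, right at the cap
print(fit_under_cap(3200, 2400))  # scaled down to fit under 2500 patches
```

Resizing on your side means the patch count (and hence billing) is predictable, instead of depending on whatever downscaling the API applies.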


Additionally, vision still gets a 1.2x multiplier on token consumption, even at this higher price. Otherwise this test would not return identical results on the same image when gpt-5.4 is given the same multiplier = 1.2 as gpt-5.2 in its truth table.

| model              | vision | vision_mult | responses input | calculated |
| ------------------ | ------ | ----------- | --------------- | ---------- |
| gpt-5.4-2026-03-05 |  patch |         1.2 |             526 |        432 |
| gpt-5.2-2025-12-11 |  patch |         1.2 |             526 |        432 |

They have a table of price multipliers for “patches” models on the vision documentation page.
Neither gpt-5.2 nor gpt-5.4 are in it.
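The multiplier arithmetic behind that table, as a sketch. That the ~7 leftover tokens are fixed message-framing overhead is my guess, not something the usage report states:

```python
import math

def billed_vision_tokens(patch_tokens: int, multiplier: float = 1.2) -> int:
    """Image tokens billed as input: raw patch count times the price multiplier."""
    return math.ceil(patch_tokens * multiplier)

calculated = 432  # patch tokens computed for the test image, per the table above
billed = billed_vision_tokens(calculated)
print(billed)        # 519
print(526 - billed)  # 7 tokens unaccounted for (message framing overhead?)
```

Since both models land on the same 526 reported input tokens for the same image, the 1.2 multiplier evidently still applies to gpt-5.4 despite it being absent from the documentation's multiplier table.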

4 Likes

Buzzing since gpt-5.4 dropped. I wrote a companion post to highlight the details that matter when you start building with gpt-5.4.

4 Likes

Prompt guidance for GPT-5.4

7 Likes

2 posts were split to a new topic: Feedback on GPT-4.1, GPT-4.5 and GPT-5.1

Leader juuuuuuuuuuuuuuhuuuuuuuuuuuu :face_blowing_a_kiss:

2 Likes

Impressive. I still need to take a closer look at the new “desktop work”, possibly through Agent Mode!? (ah, it seems to be in Codex now too), but even at this stage the model already seems far, far more capable than the average office worker (> 99.9%). I have not seen any flaws so far.

We’re at a point (if not already since 5.1) where people who know how to use AI to scale their output could see 100x gains, while the outlook for the average white-collar worker (and beyond) is becoming increasingly uncertain. And this is with a budget AI model.

I’d assume the frontier models have already pushed into levels of mathematical reasoning we can barely even imagine; it would be interesting to know what more power can do at this point.

1 Like

I’m really hoping for Codex 5.5.

2 Likes

Awesome. Let’s get using!

1 Like

Wow, that’s amazing! Congratulations!

As always, Serge is a couple of days (weeks, months, years) late. But this is pretty good stuff. I just gave it a couple of my concept files, which normally make Opus 4.6 boil apart, but surprisingly, this little beast held tight and mean. Brilliant analysis despite an intentionally weak and lazy prompt. Very promising.

3 Likes