GPT-5.1-Codex-Max is now available in the API

We recently released GPT-5.1-Codex-Max, our most intelligent coding model, in Codex. Today, it’s available in the API and many of your favorite coding tools.

GPT‑5.1-Codex-Max is purpose-built for agentic coding, based on an update to our foundational reasoning model, which is trained on agentic tasks across software engineering, math, research, and more. GPT‑5.1-Codex-Max is faster, more intelligent, and more token-efficient than its predecessors.

Read more in our blog post and use our prompting guide to begin working with the model.

8 Likes

Please describe the nature of context compaction when on the API, and its operation in the case of truncation: auto vs none.

Then why is there no truncation or maximum-turn budget control, besides the maximum model input, offered anywhere in Responses with stateful conversations, especially given that an unexposed value must be set for "auto" to operate and for compaction to operate?

1 Like

This is a great question, thank you for raising it! Truncation and compaction are two different things, but I can see how the story is a bit confusing. Let me try and clarify the difference below:

When a user sets truncation: "auto", the Responses API will automatically drop older context. There is a simple heuristic here, but the mental model is that it is just evicting old context in a not-very-smart way. It does not rely on the new compact endpoint at all.

Compaction is an explicit, smart context management operation. It returns a new compacted context window which will carry forward prior state using far fewer tokens (assuming your context window was largely filled with assistant & tool call tokens).

We’d recommend trying to use the compaction endpoint instead of ever falling back on auto truncation. There’s more coming in this area though, so stay tuned! Excited to hear feedback & iterate.
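To make the distinction concrete, here is a toy sketch of the two mental models described above. This is not the actual server-side implementation (the real heuristic and the compact endpoint's internals are not public); token counts are faked as character lengths and the summarizer is a placeholder.

```python
# Toy illustration: dumb eviction (truncation "auto") vs. smart compaction.
# Token counts are approximated by character counts for simplicity.

def truncate_auto(messages, budget):
    """Dumb eviction: drop the oldest messages until the budget fits.
    Evicted context is simply gone, nothing is carried forward."""
    kept = list(messages)
    while kept and sum(len(m["content"]) for m in kept) > budget:
        kept.pop(0)  # evict oldest first, no intelligence involved
    return kept

def compact(messages, budget, summarize):
    """Smart compaction: replace older turns with a single summary
    message that carries the prior state forward in fewer tokens."""
    kept = list(messages)
    evicted = []
    while kept and sum(len(m["content"]) for m in kept) > budget:
        evicted.append(kept.pop(0))
    if evicted:
        # Prior state survives, compressed into one synthetic message.
        kept.insert(0, {"role": "assistant", "content": summarize(evicted)})
    return kept

history = [{"role": "user", "content": "x" * 50} for _ in range(10)]
short = truncate_auto(history, budget=200)   # oldest turns are simply gone
packed = compact(history, budget=200, summarize=lambda ms: "summary")
```

The difference in outcome: `short` keeps only the newest turns, while `packed` keeps the same newest turns plus one synthetic message standing in for everything evicted.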

5 Likes

That’s amazing!
When can we get gpt-5.1-max?

GPT-5.1 on "high" already uses more tokens than GPT-5 on hard questions, almost double, yet fewer on easier tasks or at lower reasoning settings. So what would a "max" mean for the current model? Thinking less about code while keeping quality, as -max is purported to do, might not be generally applicable.

The 12x cost of a real “max” would be gpt-5-pro.


So: there is no included or "auto" compaction intrinsic to the model or to Responses, like what is seen in demos of ChatGPT's Codex.

You have to send a full input context to an endpoint and, it seems, pay for an extra turn of a reasoning model, which takes that large input (likely simply by prompting rather than tuning, since you can specify a variety of models to be used) and produces a large encrypted blob that can be received but not inspected for quality. Only "believe us, get platform-locked".

If I am paying for merely a prompt and trickery, why would I not prompt an AI myself and get clear content I can observe and further truncate, or run the expiring chunk of messages through to receive what are still discrete, manageable messages in my own summary format? One wonders…
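The do-it-yourself approach described above could look roughly like this: build an ordinary request that asks any model to summarize the expiring chunk, so the result comes back as readable text you can inspect, edit, or truncate further. The model name and the summarization prompt here are placeholders of my own, not anything OpenAI ships.

```python
def build_diy_compaction_request(expiring_messages, model="gpt-5.1"):
    """Build a plain request payload that asks a model to summarize an
    expiring chunk of conversation. Unlike an opaque encrypted blob, the
    summary comes back as plain text under your own control."""
    transcript = "\n".join(
        f'{m["role"]}: {m["content"]}' for m in expiring_messages
    )
    return {
        "model": model,  # placeholder; pick whichever model you prefer
        "input": [
            {"role": "system",
             "content": "Summarize this conversation chunk, preserving "
                        "decisions, open tasks, and tool results."},
            {"role": "user", "content": transcript},
        ],
    }

request = build_diy_compaction_request([
    {"role": "user", "content": "hello"},
    {"role": "assistant", "content": "hi, how can I help?"},
])
```

The payload is just a dict, so it can be inspected, logged, or modified before it is ever sent, which is exactly the observability the opaque compact endpoint does not offer.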

That also disagrees with the "happens by magic" demonstrated in a long console video, and it cannot mimic the ChatGPT + Codex pattern when the AI is going off internally in Responses. For example, suppose I supply 250k of the 272k max input, then send the AI off doing internal reasoning, adding more and more. I might have no interruptible turn in the analysis channel before it exceeds its own context window, and if "truncation: auto" is still my only alternative, then every iteration becomes a total cache miss. The only place to "catch" and "compact" is live, during function or "patches" tool calls.

That’s great!

I mean, will we have a jump from gpt-5.1 to gpt-5.1-max, like from gpt-5.1-codex to gpt-5.1-codex-max?

How about if OpenAI were to just increment the general AI model version with one that is smarter and competitive, sooner than you expect? :laughing:

Given how OpenAI has been over the past couple of years, I’ve pretty much stopped fantasizing that they’ll suddenly drop a much smarter general model one day.
This whole “one model can do everything” direction looks more like taking the easy way out than actually getting stronger.
They’d be better off learning from approaches like DeepSeek Math v2 and building serious specialist math models, so people can actually compare and see the difference!

I just started coding using ChatGPT 5.1. Can you explain to me how to use Codex properly?

1 Like

Hi and welcome to the community!

Your question is quite broad, so the Codex guide examples are a great place to start. Getting up and running is straightforward, and those resources will likely answer many of your initial questions.

And of course, you are always welcome to ask for help or share your own insights here in the community.

1 Like

If you are interested in trying it, it's on LMArena (Code Arena, last option in the dropdown).

This update sounds really exciting for anyone who builds software. The new model feels faster and smarter, which is great for everyday coding work. I also like how the tools and website make it easy to explore and try the model; we've seen similar benefits when working on projects through our own website, neefox. Looking forward to seeing how the community puts this to use.