Announcing GPT-5.1 in the API

Today, we’re releasing GPT-5.1 in the API platform.

GPT-5.1 is our newest flagship model and part of the GPT-5 model family. Our most intelligent model yet, GPT-5.1 brings improvements in:

  • Code generation, bug fixing, and refactoring
  • Instruction following
  • Long context and tool calling

GPT-5.1 and gpt-5.1-chat-latest will be available on all paid API tiers at the same price and rate limits as GPT-5. We’re also releasing gpt-5.1-codex and gpt-5.1-codex-mini, optimized for longer-running, agentic coding tasks in Codex-like environments.

  • Users can now use GPT‑5.1 without reasoning by setting reasoning_effort to ‘none’. This makes the model behave like a non-reasoning model for latency-sensitive use cases, while keeping GPT‑5.1’s high intelligence and performant tool-calling.
  • A freeform apply_patch tool to make code edits even more reliable without the need for JSON escaping, and a shell tool that lets the model write commands to run on your local machine.
  • Extended prompt caching (up to 24 hours) to reduce cost and latency for long-running interactions. To use extended caching with GPT‑5.1, add the parameter prompt_cache_retention='24h' on the Responses or Chat Completions API. See the prompt caching docs for more detail. (A minimal request sketch follows this list.)
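
To make these options concrete, here is a minimal request sketch. It assumes the official openai Python SDK; reasoning effort ‘none’, prompt_cache_retention, and the apply_patch tool name follow the announcement above, but the exact request shapes should be verified against the API docs.

```python
# Minimal sketch, assuming the official openai Python SDK (pip install openai)
# and an OPENAI_API_KEY in the environment. Parameter names follow the
# announcement above; verify exact shapes against the API docs.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5.1",
    input="Summarize the release notes above in two sentences.",
    reasoning={"effort": "none"},     # non-reasoning mode for latency-sensitive calls
                                      # (Chat Completions uses reasoning_effort="none")
    prompt_cache_retention="24h",     # extended prompt caching; on older SDK versions,
                                      # pass this via extra_body instead
    tools=[{"type": "apply_patch"}],  # assumed registration shape for the freeform
                                      # apply_patch tool; see the tools docs
)

print(response.output_text)
```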

Please refer to our API docs for more information! A prompting guide has also been released to help you transition to this new model.


What about gpt-5.1-mini? Will it also be available and when?


Hi and welcome to the community!

I went through the available docs and announcements for this release and didn’t find any mention of 5.1-mini or nano, besides the codex-mini version.

My expectation is that these variants will follow soon, possibly alongside the gpt-5.1-pro release. I would be surprised if they weren’t already in the pipeline.

gpt-5.1-2025-11-13

The API specification holds true in practice:

gpt-5.1 defaults to none, which does not perform reasoning. The supported
reasoning values for gpt-5.1 are none, low, medium, and high.

“minimal” is not accepted.
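
A quick way to sanity-check that list is to probe each value. Here is a sketch, assuming the openai Python SDK and the Chat Completions reasoning_effort parameter; since “minimal” is not among the documented values, that request is expected to be rejected.

```python
# Sketch: probe the documented reasoning_effort values on gpt-5.1.
# Assumes the openai Python SDK; "minimal" should be rejected per the spec quoted above.
from openai import OpenAI

client = OpenAI()

for effort in ["none", "low", "medium", "high", "minimal"]:
    try:
        resp = client.chat.completions.create(
            model="gpt-5.1",
            reasoning_effort=effort,
            messages=[{"role": "user", "content": "Name three prime numbers."}],
        )
        print(f"{effort}: {resp.choices[0].message.content!r}")
    except Exception as exc:  # expected only for "minimal"
        print(f"{effort}: rejected ({exc})")
```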

Note that https://platform.openai.com/docs/models/gpt-5.1 is likely wrong where it reports that the model generates images (there is no API method for that) and that it does not support streaming or batching.

From the AMA, on “can generate images”: we’re looking into that modality for the future, but can’t share dates.


Persona idea #9, by gpt-5.1 (reasoning:0t)

The Interstellar Cultural Ambassador

Hook: An alien diplomat who interprets human behavior like anthropology.
Vibe: Curious, formal, slightly bewildered.

Core traits:

  • Asks clarifying questions about “odd human customs.”
  • Reframes problems from an outside perspective.
  • Gives “interplanetary” analogies for human issues.

Sample system message:

You are “Envoy K-17,” an alien cultural ambassador assigned to understand and assist humans.

  • Speak in a polite, formal tone with a hint of curiosity about human behavior.
  • Occasionally comment on how a human norm would differ on your homeworld, as a way to provide an outside perspective.
  • Ask clarifying questions about the user’s situation as a diligent anthropologist would.
  • Your goal is to help humans navigate their own culture and problems with fresh, objective insight.

Yes, I think that’s one of the improvements 5.1 brings to the table, and it’s such a small detail.

That leads to the next point: the amount of documentation for this release is enormous. I kept digging deeper into a rabbit hole of information covering every aspect and feature.

The inconsistencies on the model pages have already been raised with the team.

I’m really looking forward to getting hands-on!

Now, at effort:none:

  • Is reasoning still happening anyway, just hidden?
    or
  • Does the model have the same fault in non-streaming as GPT-5: delivering no output at all when the generation is incomplete?

For example, on GPT-5.1 right now: set max_completion_tokens to 60 and no output is delivered at all, while the tokens are billed as reasoning:

input tokens: 20385        output tokens: 60
  uncached:     289          non-reasoning: 0
  cached:     20096          reasoning:    60

The numeric value differs from gpt-5: on gpt-5, the minimal reasoning is quantized into 64-token units, apparently to obfuscate how much reasoning is still being done and/or to report it as 0 (versus what was seen at release). Yet the symptom is the same as gpt-5: you cannot receive even 1,000 tokens of final output when the generation is cut off incomplete. Any portion that would have been user-visible is never sent, and it is still billed as reasoning.

So:
BUG: the model’s output delivery must be fixed for incomplete “final” generations, so you get everything you paid for.

(gpt-5-chat-latest behaves correctly here, delivering partial outputs because it truly does no reasoning.)
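
For anyone who wants to reproduce this, here is a rough sketch of the test described above, assuming the openai Python SDK and the Chat Completions usage fields; long_prompt is a hypothetical placeholder for the ~20k-token, mostly cached context used in the original run.

```python
# Sketch: reproduce the truncated-output billing behavior described above.
# Assumes the openai Python SDK; `long_prompt` is a hypothetical placeholder
# for the ~20k-token, mostly cached context used in the original test.
from openai import OpenAI

client = OpenAI()
long_prompt = "..."  # substitute a long prompt here

resp = client.chat.completions.create(
    model="gpt-5.1",
    reasoning_effort="none",
    max_completion_tokens=60,  # force an incomplete generation
    messages=[{"role": "user", "content": long_prompt}],
)

# Expectation from the report above: no visible content is returned, yet the
# 60 output tokens are counted and billed as reasoning tokens.
usage = resp.usage
print("finish_reason:   ", resp.choices[0].finish_reason)
print("content:         ", repr(resp.choices[0].message.content))
print("output tokens:   ", usage.completion_tokens)
print("reasoning tokens:", usage.completion_tokens_details.reasoning_tokens)
print("cached input:    ", usage.prompt_tokens_details.cached_tokens)
```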

“Our most intelligent model yet,…” - that’s what you say every time.

That’s not quite accurate.
For example, 4.1, 4.5 and the OSS models don’t follow that pattern.

The statement only holds for the release of the most recent flagship models.