Our most powerful reasoning models
o3 and o4-mini are now available in the API. o3 achieves leading performance on coding, math, science, and vision—it tops the SWE-Bench Verified leaderboard with a score of 69.1%, making it the best model for agentic coding tasks. o4-mini is our faster, cost-efficient reasoning model.
While they’re available in both the Chat Completions and Responses APIs, for the richest experience, we recommend the Responses API. It supports reasoning summaries—the model’s thoughts stream while you wait for the final response—and enables smarter tool use by preserving the model’s prior reasoning between calls.
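As a minimal sketch of what a Responses API request looks like, the snippet below just constructs the JSON body rather than sending it (a real call needs an API key and a POST to `/v1/responses`, or the official SDK). The field names follow the public Responses API; the response id in the follow-up is a placeholder.

```python
import json

# Request body for a Responses API call that asks for a reasoning summary.
# "summary": "auto" requests the model's summarized thoughts; "stream": True
# lets them arrive while you wait for the final response.
request_body = {
    "model": "o4-mini",
    "input": "How many primes are there below 50?",
    "reasoning": {"effort": "medium", "summary": "auto"},
    "stream": True,
}

# A follow-up turn preserves the model's prior reasoning between calls by
# referencing the previous response's id (placeholder id shown here).
follow_up = {
    "model": "o4-mini",
    "previous_response_id": "resp_123",  # placeholder, returned by the API
    "input": "Now list them.",
}

print(json.dumps(request_body, indent=2))
```

In practice you would send `request_body` with your HTTP client or SDK of choice and read the streamed summary events before the final output arrives.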
o4-mini is available to developers on tiers 1–5, and o3 is available to developers on tiers 4–5. Developers on tiers 1–3 can gain access to o3 by verifying their organizations. Reasoning summaries and streaming also require verification.
For these models, we’re introducing Flex processing: significantly lower per-token prices in exchange for slower response times and occasional resource unavailability. Flex processing helps you optimize costs even further when using these models for non-urgent workloads such as background agents, evals, or data pipelines.
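Because Flex requests trade availability for price, non-urgent jobs should tolerate capacity errors. Below is a small, stubbed retry sketch: `call_with_retry` is a hypothetical helper (not part of any SDK), and `fake_send` stands in for an HTTP POST that would include a Flex service-tier setting in the request body.

```python
import time

def call_with_retry(send, max_retries=3, base_delay=1.0):
    """Retry a request on capacity errors with exponential backoff.

    `send` is any callable that raises on a retryable failure; in real use
    it would wrap an API call made with the Flex processing tier.
    """
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RuntimeError:  # stand-in for a rate-limit / timeout error
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Stubbed sender that fails twice before succeeding, to illustrate the flow.
attempts = {"n": 0}

def fake_send():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("capacity")
    return {"status": "completed"}

result = call_with_retry(fake_send, base_delay=0.01)
print(result["status"])  # prints "completed" after two retries
```

Backoff like this suits background agents and pipelines, where waiting a few extra seconds costs nothing compared to the per-token savings.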
Developer-first models: GPT-4.1
We launched GPT-4.1, GPT-4.1 mini, and GPT-4.1 nano in the API, trained for developer use cases such as coding, instruction following, and function calling. They also have larger context windows—supporting up to 1 million tokens of context—and make better use of that context with improved long-context comprehension.
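Since function calling is one of the headline use cases, here is a sketch of a request body with a tool attached. `get_weather` is an illustrative name, not an OpenAI-provided function; the schema follows the standard `tools` format, and the body is only constructed here, not sent.

```python
import json

# Hypothetical tool definition the model can choose to call. The
# "parameters" field is a JSON Schema describing the function's arguments.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Return the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

request_body = {
    "model": "gpt-4.1",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
}

print(json.dumps(request_body, indent=2))
```

If the model decides to call the tool, the response contains the function name and JSON arguments; your code executes the function and returns the result in a follow-up message.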
Codex CLI
Meet Codex CLI—an open-source local coding agent that turns natural language into working code. Tell Codex CLI what to build, fix, or explain, then watch it bring your ideas to life. Codex CLI works with all OpenAI models, including o3, o4-mini, and GPT-4.1. Watch the demo.