Today we’re releasing the latest model in our reasoning series, OpenAI o3-mini, and you can start using it now in the API. o3-mini can outperform o1 in coding and other reasoning tasks, is 93% cheaper, and has lower latency. It supports function calling, Structured Outputs, streaming, and developer messages. You can also choose between three reasoning effort options (low, medium, and high) to optimize for your specific use cases. This flexibility allows o3-mini to “think harder” when tackling complex challenges or to prioritize speed. In addition to the Chat Completions API, you can use o3-mini in the Assistants API and Batch API today. Like o1, o3-mini comes with a 200,000-token context window and a maximum output of 100,000 tokens.
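For example, here’s a minimal Chat Completions call that sets the reasoning effort (a sketch using the official `openai` Python SDK; the prompt is just illustrative):

```python
from openai import OpenAI  # official openai Python SDK (v1.x)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask o3-mini to "think harder" on a complex task by raising reasoning effort.
response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="high",  # "low" | "medium" (default) | "high"
    messages=[
        {"role": "developer", "content": "You are a concise coding assistant."},
        {"role": "user", "content": "Write a function that merges two sorted lists."},
    ],
)
print(response.choices[0].message.content)
```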
o3-mini is our first small reasoning model that supports these developer features out of the gate, making it production-ready today. Keep in mind that prompting our reasoning models differs from prompting our GPT-series models, so we’ve created a guide to help you get the best results. With its cost efficiency and speed, we’re excited to see how you integrate o3-mini, particularly in agentic and coding use cases.
It did for me… I was able to get a run through the Assistants API with o3-mini-2025-01-31 selected in the Playground, with several tools attached.
One run came back (quick!) with no tool calls, which was correct since none were needed. I then wrote a message that I knew would trigger one of my tools (without naming it directly), and it called the tool perfectly in the Playground (also quick!).
I was able to get a local ChatCompletion to work with the same model.
For some reason, it keeps failing for the Assistants API locally (same assistant ID as above), but I’m sure something in my setup conflicts with running o3-mini through Assistants; I’m just not sure what it is quite yet.
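For anyone comparing notes, this is roughly the shape of the Assistants flow I’m testing (a sketch using the SDK’s beta Assistants endpoints; the `lookup_order` tool is a made-up stand-in for my real ones):

```python
from openai import OpenAI

client = OpenAI()

# Create an assistant on the dated o3-mini snapshot with one (hypothetical) tool.
assistant = client.beta.assistants.create(
    model="o3-mini-2025-01-31",
    instructions="Use the lookup tool when the user asks about an order.",
    tools=[{
        "type": "function",
        "function": {
            "name": "lookup_order",  # hypothetical tool name
            "description": "Look up an order by its ID.",
            "parameters": {
                "type": "object",
                "properties": {"order_id": {"type": "string"}},
                "required": ["order_id"],
            },
        },
    }],
)

thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "Where is my order 12345?"}]
)

# create_and_poll blocks until the run completes or requires a tool call.
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)
print(run.status)  # "requires_action" here means the tool was invoked
```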
I’ve noticed that in the web version there are two o3-mini models: one labeled “high” and another without any suffix. However, the release notes mention three reasoning effort options: low, medium, and high. Could someone clarify which one the unlabeled model in the interface uses: is it o3-mini-low or o3-mini-medium?
No “thinking” metadata in the streaming, for progress and keep-alive? Is this simply not going to be included as a feature, or is it still a possibility (like in ChatGPT)?
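For reference, as far as I can tell, a minimal stream consumer like the sketch below (standard `openai` Python SDK) only ever sees content deltas, with nothing emitted while the model is reasoning:

```python
from openai import OpenAI

client = OpenAI()

# Stream a response; each chunk carries only content deltas today,
# with no separate "thinking" field for progress or keep-alive.
stream = client.chat.completions.create(
    model="o3-mini",
    messages=[{"role": "user", "content": "Explain memoization briefly."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
```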
Bonus: Check “Data Controls → Sharing” in your API organization “profile”.
(Sharing may benefit API specialist applications in the future, perhaps as a source of RLHF with a different pattern than “I’m ChatGPT”, but one wonders whether the models’ failures are also of use.)
Thanks for your answer!
As I understand it, the usage limits within the Plus plan are specified as a combined total for both models, the medium and the high one? Or is there a separate limit for each model? Thank you.
Ah, are you referring to the ChatGPT Plus plan? This post and forum are about the API and developing with the o3-mini model on the OpenAI Platform, so I don’t have an immediate answer, but I believe it’s the same as o1: 50 messages per week. Sorry to redirect you, but you can get in touch with the ChatGPT support team via the ChatGPT Help Center here: https://help.openai.com/en/?q=contact.
How soon after becoming a tier-3 organization does o3-mini become available? I hit tier-3 this morning (~6 hours ago) but still see
{'error': {'message': 'The model `o3-mini` does not exist or you do not have access to it.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}
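In the meantime, I’m polling for access with something like this sketch (standard `openai` Python SDK):

```python
from openai import OpenAI

client = OpenAI()

# List the models visible to this API key and check whether o3-mini has landed.
available = {m.id for m in client.models.list()}
print("o3-mini" in available)
print(sorted(name for name in available if "o3" in name))
```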
I used o3-mini today to write a prompt for o3-mini-high to write a Python app using Flask to search patents. I should say rewrite it; it took roughly 45 seconds. It was fast enough that I could hardly scroll to keep up with it. I made a video if anyone wants to see what it did in those 45 seconds, but apparently we can’t post links here (understandable).