There don't seem to be any docs on which version we can call in the API. I'm guessing the API o3-mini model is just the "high" version?
There is a new parameter called `reasoning_effort`, which you can set to `"high"`, `"medium"`, or `"low"`.
```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o3-mini-2025-01-31",
    messages=[
        {
            "role": "developer",
            "content": [
                {"type": "text", "text": "You are a helpful assistant"}
            ],
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "why does the moon sometimes appear in the day?"}
            ],
        },
    ],
    response_format={"type": "text"},
    reasoning_effort="low",
)
```
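For reference, here is how you would read the assistant's reply out of the response. This is a sketch against a plain dict shaped like the Chat Completions JSON payload (so it runs without an API key); with the SDK's response object you would use attribute access instead.

```python
# Sketch: a dict shaped like the Chat Completions JSON payload,
# standing in for a real API response so this runs offline.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "The Moon is often bright enough to be visible in daylight.",
            }
        }
    ]
}

# With a dict, index by key; with the SDK object it would be
# response.choices[0].message.content
answer = sample_response["choices"][0]["message"]["content"]
print(answer)
```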
What is the reasoning effort for the o3-mini in ChatGPT? Is it medium or low?
Does the “reasoning_effort” have an impact on the cost or anything else? I don’t want to be blindsided.
Hi @markhallak! `reasoning_effort` has an impact on both latency and cost. Higher effort induces more "reasoning tokens" behind the scenes (which are billed as output tokens), and more reasoning tokens also mean slower response times. So the best approach is to start at low effort, evaluate, and then ramp up if needed.
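To see that impact concretely, you can inspect the `usage` block of a response, which breaks out reasoning tokens from the completion-token total. A minimal sketch, using a dict shaped like the API's `usage` payload so it runs offline; the prices below are placeholders, not actual o3-mini rates, so substitute the values from the current pricing page.

```python
# Sketch: estimating cost from a response's usage block. Field names
# mirror the Chat Completions JSON payload; with the SDK object you
# would read response.usage.completion_tokens_details.reasoning_tokens.
usage = {
    "prompt_tokens": 20,
    "completion_tokens": 500,  # includes reasoning tokens
    "completion_tokens_details": {"reasoning_tokens": 384},
}

# Placeholder per-token prices -- NOT real o3-mini rates.
INPUT_PRICE = 1.10 / 1_000_000
OUTPUT_PRICE = 4.40 / 1_000_000

reasoning = usage["completion_tokens_details"]["reasoning_tokens"]
visible = usage["completion_tokens"] - reasoning  # tokens you actually see

# Reasoning tokens are billed at the output rate even though they
# never appear in the reply -- this is the "blindsided" part.
cost = (usage["prompt_tokens"] * INPUT_PRICE
        + usage["completion_tokens"] * OUTPUT_PRICE)

print(f"reasoning tokens: {reasoning}, visible tokens: {visible}")
print(f"estimated cost: ${cost:.6f}")
```

With higher `reasoning_effort`, `reasoning_tokens` (and therefore cost and latency) grows while the visible answer may stay the same length.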
Hi @teng0112, I'd say this is a different question from the OP's, since it deals with ChatGPT rather than the API. I don't know the answer, but given that ChatGPT is a more complex system with many bits and pieces behind the scenes, I wouldn't be surprised if there is some kind of routing or categorization mechanism that determines what level of effort should be applied. But this is purely speculative.
I can add that "o3-mini defaults to 'medium' in ChatGPT and the API."
This is a direct quote from a staff member. I didn't ask whether there is a classifier that dynamically selects low or high reasoning effort when needed.
Awesome, thanks for the info @vb!