New reasoning models: OpenAI o1-preview and o1-mini

I’m Nikunj, PM for the OpenAI API. I’m pleased to share with you our new series of models, OpenAI o1. We’ve developed these models to spend more time thinking before they respond. They can reason through complex tasks and solve harder problems than previous models in science, coding, and math.

There are two models:

  • Our larger model, o1-preview, which has strong reasoning capabilities and broad world knowledge.
  • Our smaller model, o1-mini, which is 80% cheaper than o1-preview.

Developers on usage tier 5: you can get started with the beta now. Try both models! You may find one better than the other for your specific use case. Both currently have a rate limit of 20 RPM during the beta. But keep in mind o1-mini is faster, cheaper, and competitive with o1-preview at coding tasks (you can see how it performs here).

Developers on tiers 1-4: these models aren’t available in the API for your account while we’re in this short beta period. We’ll be in touch over email as we expand access.

ChatGPT Plus subscribers can also try o1 today.

You can read more about these models in our blog post. Keep in mind OpenAI o1 isn’t a successor to gpt-4o. Don’t just drop it in—you might even want to use gpt-4o in tandem with o1’s reasoning capabilities. I’m excited to see how you add reasoning to your products!

58 Likes

Extremely impressed so far with domain-specific knowledge, absolutely incredible!

Will be testing much more in the morning. DevDay is gonna be awesome!

6 Likes

Will image or file upload be enabled or is this a limit?

8 Likes

Just a heads-up to any devs migrating to ‘o1’ models via API, it seems like the first message’s role can no longer be ‘system’. Traditionally, we started each completion with [systemMessage, userMessage], but we are now doing [userMessage, userMessage] to get ‘o1-preview’ working.

8 Likes

No other widely available model has been able to answer this question before. o1 nailed it in 23 seconds. Pretty amazing.

12 Likes

Thank you for sharing this heads up! Yes, system messages are not supported during the beta phase, it’s a known limitation at the moment.

You can read more about limitations here as you build with o1 using the Chat Completions API: https://platform.openai.com/docs/guides/reasoning/beta-limitations

Thanks!

8 Likes

Just took for a spin via API, definitely a more thoughtful and comprehensive response to some of my more complicated tasks that usually require I break up into smaller tasks. Would LOVE to see the Chain of Thought but don’t see a way can through API, just the reasoning token count.

2 Likes

Regarding the API:

Is there a risk that concealing the model’s internal “processing” or “thinking” could set a precedent for other LLM providers?

Currently, we have to trust that the output tokens are generated fairly, without any transparency into the process. We’re only given the input and the final output (A and Z), without insight into what happens in between.

If this becomes the norm for LLMs that incorporate “thinking” or “reflection,” anyone integrating these services could face significant challenges in troubleshooting when something goes wrong.

3 Likes

Yes. I think it’s much safer to assume that employees are prompt engineering the heck out of various stages of 4o performance all the time.

I tried some questions from simple bench, and it got them both correct. Very impressive.

(Ice Cubes in a Frying Pan)[ChatGPT]

(There are no more cookies)[ChatGPT]

3 Likes

The reasoning engine we built does the same internal thinking process but in the case of our engine we can potentially generate 100’s of thousands of intermediate thinking tokens. We hide these tokens as well for I’m assuming similar reasons as OpenAI. They’re really only meaningful to the LLM.

I have a training mode I can put our engine in that dumps everything to a stream of jsonl files which are potentially useful for fine tuning. I use these files to debug some but to be honest there’s not a lot you can do in response to them. They’re really only meaningful to the model. They can give you hints about how you can potentially improve your prompt but they’re hints at most.

4 Likes

For many general users it might be practical as an option to hide the underlying processing, as most people don’t require access to this level of detail.

However, for those of us integrating AI into our own solutions, troubleshooting issues often demands deeper insight. In programming, even though most users won’t dig into debug logs, there are always some who need to investigate and understand what’s happening under the hood.

Even for us who want to create our own models and advance in machine learning through our own theories.

In the context of AI, the model’s “processing” is critical—it shows how the output was generated from the input. Regardless of how many users need this, the opportunity to access such information should be available for those who do.

At the very least it would be fair as a toggle able option, considering that ultimately we are paying for the tokens.

6 Likes

One issue they would need to address is how they stream these to you. They would have to probably do it as a new message type

1 Like

what’s interesting is that I got the same correct answer for the Ice Cube question but when I shift the format of the output to where it has to answer using HTML it gets it wrong.

What this highlights is that it’s not actually reasoning it’s way to an answer. It’s just memorizing answers…

4 Likes

This is really exciting. Basically it is just a LLM fine-tuned for extended thinking aloud, but this opens many possibilities.

I share the concern of previous posters regarding the limited explainability of model responses when the chain of thought is hidden. But in ChatGPT, we currently can see the internal thinking process of o1. Is this actually filtered and paraphrased by another LLM or does this limitation only apply to the API? What information is tried to hide exactly?

FWIW, of course even o1 can be led to reasoning abilities pretty quickly. I asked it to divide two numbers in nonal system without converting it to decimal and basically it stopped its inner monologue prematurely and gave me a wrong answer: https://chatgpt.com/share/66e3a3dc-2808-8004-a6b7-15670d6356de

Apparently one of the hardest logical problems, solved by o1-preview but not sufficiently by 4o:

Three gods A, B, and C are called, in no particular order, True, False, and Random. True always speaks truly, False always speaks falsely, but whether Random speaks truly or falsely is a completely random matter. Your task is to determine the identities of A, B, and C by asking three yes–no questions; each question must be put to exactly one god. The gods understand English, but will answer all questions in their own language, in which the words for yes and no are da and ja, in some order. You do not know which word means which.

o1-preview took 83 seconds, but what an amazing answer!

To solve this intricate puzzle, we’ll break it down step by step. The goal is to determine the identities of the three gods—A, B, and C—who are either True, False, or Random. We’ll do this by asking three carefully crafted yes–no questions, each directed to one god. The gods answer in their own language, using the words “da” and “ja”, but we don’t know which word means “yes” or “no.”


Understanding the Challenge

  1. Gods’ Behavior:

    • True always tells the truth.
    • False always lies.
    • Random answers randomly with “da” or “ja.”
  2. Constraints:

    • We don’t know which words mean “yes” or “no.”
    • We must identify each god using only three yes–no questions.
    • Each question is directed to only one god.
    • Questions can depend on previous answers.

Strategy Overview

  • Step 1: Identify a god who is not Random.
  • Step 2: Determine the meaning of “da” and “ja.”
  • Step 3: Use the information from Steps 1 and 2 to identify the gods.

Step-by-Step Solution

Step 1: Find a God Who Is Not Random

Question 1: Ask God A:

“If I asked you ‘Is God B Random?’, would you say ‘da’?”

Analysis:

  • Case 1: A is Random
    • Since Random’s answers are unpredictable, we cannot rely on the response.
  • Case 2: A is True
    • If B is Random, True would answer “yes.”
      • When we ask True, “Would you say ‘da’?” he truthfully answers “da.”
    • If B is not Random, True would answer “no.”
      • He would truthfully respond with “ja.”
  • Case 3: A is False
    • If B is Random, False would lie and say “no.”
      • When asked, “Would you say ‘da’?” he lies and says “ja.”
    • If B is not Random, False would lie and say “yes.”
      • He lies about saying “da,” so he answers “da.”

Interpretation:

  • If the answer is “da”, then either:
    • A is True and B is Random, or
    • A is False and B is not Random.
  • If the answer is “ja”, then either:
    • A is True and B is not Random, or
    • A is False and B is Random.

Conclusion:

  • Regardless of the answer, we can conclude that God C is not Random, because only one god is Random.
  • Therefore, we now have at least one god (either A or C) who is not Random.

Step 2: Determine the Meaning of “da” and “ja”

Now, we focus on a god we know is not Random. Let’s assume from Question 1 that God A is not Random (since Random’s answers are unreliable). We’ll use God A for the next question.

Question 2: Ask God A:

“Is ‘da’ your word for ‘yes’?”

Analysis:

  • Case 1: A is True
    • If “da” means “yes”, True answers “da.”
    • If “da” means “no”, True answers “ja.”
  • Case 2: A is False
    • If “da” means “yes”, False lies and answers “ja.”
    • If “da” means “no”, False lies and answers “da.”

Interpretation:

  • If God A answers “da”:
    • Either “da” means “yes” and A is True, or
    • “da” means “no” and A is False.
  • If God A answers “ja”:
    • Either “da” means “no” and A is True, or
    • “da” means “yes” and A is False.

Conclusion:

  • We need additional information to determine both the meaning of “da” and the identity of God A.
  • However, we now have narrowed down the possibilities.

Step 3: Identify All Gods

Using the information from Questions 1 and 2, we can craft the third question to resolve any remaining ambiguities.

Scenario Analysis:

Scenario 1: From previous answers, suppose we deduce that “da” means “yes” and God A is True.

Question 3: Ask God A:

“Is God B False?”

  • Since A is True, he will answer truthfully.
  • If he answers “da” (“yes”), then God B is False, and God C must be Random.
  • If he answers “ja” (“no”), then God C is False, and God B is Random.

Scenario 2: Suppose we deduce that “da” means “no” and God A is False.

Question 3: Ask God A:

“Is God C Random?”

  • Since A is False, he will lie.
  • If he answers “da” (“no”), which actually means “no,” but since he’s lying, the truthful answer is “yes.” So, God C is Random.
  • If he answers “ja” (“yes”), which actually means “yes,” but he’s lying, so the truthful answer is “no,” and God C is not Random.

Final Identification:

  • Based on Question 3, you can definitively assign identities to God B and God C.

Summary of Questions and Logic

  1. Question 1 to God A:

    • “If I asked you ‘Is God B Random?’, would you say ‘da’?”
    • Purpose: Identify a god who is not Random.
  2. Question 2 to God A:

    • “Is ‘da’ your word for ‘yes’?”
    • Purpose: Determine the meaning of “da” and “ja” and the identity of God A.
  3. Question 3 to God A or another known non-Random god:

    • Depending on previous answers, ask about the identity of another god.
    • Purpose: Assign the correct identities to the remaining gods.

Conclusion

By carefully analyzing the answers to these three questions, you can deduce:

  • The meanings of “da” and “ja.”
  • The identity of God A (whether he is True or False).
  • The identities of God B and God C by elimination and logical inference.

Note: This solution hinges on the logical consistency of the gods’ answers and strategically uses their truth-telling or lying nature to unravel the puzzle.

4 Likes

Thanks for your support @osmonov Great advice.

There is one problem though:
how did you conclude that God C is not Random in Step 1

o1-preview after 33 s:
Thank you for pointing out this important detail. You are absolutely correct to question how I concluded that God C is not Random in Step 1. Upon revisiting my previous explanation, I realize that the conclusion was not fully justified. Let me clarify and correct my reasoning.


Re-examining Step 1

Question 1: We ask God A:

“If I asked you ‘Is God B Random?’, would you say ‘da’?”

Analyzing the Possible Scenarios

Since we don’t know which word means “yes” and which means “no,” and because Random answers unpredictably, we need to consider all possibilities carefully.

Case 1: God A is Random

  • If A is Random, their answer is random and doesn’t provide reliable information.
  • Therefore, we cannot draw any conclusions about the identities of God B or God C from this case.

Case 2: God A is True

  • Subcase 2a: God B is Random

    • True would truthfully say “yes” to “Is God B Random?”
    • Since we are asking True if he would say “da,” his response depends on whether “da” means “yes” or “no,” which we don’t know.
    • Therefore, we cannot definitively interpret his answer without additional information.
  • Subcase 2b: God B is not Random

    • True would truthfully say “no” to “Is God B Random?”
    • Again, the mapping of “da” and “ja” to “yes” and “no” is unknown, so we can’t interpret the response conclusively.

Case 3: God A is False

  • Subcase 3a: God B is Random

    • False would lie and say “no” to “Is God B Random?”
    • When we ask False if he would say “da,” his response is influenced by his lying nature and the unknown word meanings, making it ambiguous.
  • Subcase 3b: God B is not Random

    • False would lie and say “yes” to “Is God B Random?”
    • Similar to previous cases, we cannot definitively interpret the answer.

Conclusion from Step 1

  • Key Point: Given the uncertainty of the meanings of “da” and “ja,” and the possibility that God A could be Random, we cannot conclusively determine that God C is not Random based solely on the answer to the first question.
  • Therefore, my initial conclusion that “God C is not Random” was incorrect.

Correcting the Approach

To solve this puzzle effectively, we need a strategy that:

  1. Accounts for the unknown meanings of “da” and “ja.”
  2. Eliminates the randomness factor by directing questions to gods who are not Random.
  3. Uses self-referential questions that yield consistent answers from True and False, regardless of the word meanings.

Revised Step-by-Step Solution

Step 1: Formulating a Question that Bypasses Uncertainties

We need a question that ensures both True and False will give the same answer, and that Random’s unpredictability is minimized.

Question 1 to God A:

“If I asked you ‘Is God B the Random god?’, would you say ‘da’?”

But as we’ve seen, this doesn’t eliminate the ambiguities.

Instead, let’s use a logical construct that has been proven to work:

Question 1 to God A:

“Does ‘da’ mean ‘yes’ if and only if you are True if and only if God B is Random?”

This question is designed to ensure that the answer will be “da” if God A is True and God B is Random, or if God A is False and God B is not Random.

However, this is quite complex and may not lead us directly to the solution.

A better approach is to use the standard method:

Using the Standard Solution

Question 1 to God A:

“If I asked you ‘Is your word for ‘yes’ ‘da’?’, would you say ‘da’?”

Analysis:

  • If God A is True:
    • If “da” means “yes,” then the truthful answer to “Is your word for ‘yes’ ‘da’?” is “yes,” so True would answer “yes,” which corresponds to “da.”
    • If “da” means “no,” then the truthful answer is “no,” so True would answer “no,” which corresponds to “ja.”
  • If God A is False:
    • If “da” means “yes,” the truthful answer is “yes,” but False lies and says “no,” corresponding to “ja.”
    • If “da” means “no,” the truthful answer is “no,” but False lies and says “yes,” corresponding to “da.”

Interpretation:

  • If God A answers “da,” then “da” means “yes” and God A is True, or “da” means “no” and God A is False.
  • If God A answers “ja,” then “da” means “no” and God A is True, or “da” means “yes” and God A is False.

Conclusion from Question 1:

  • Either way, we can determine whether “da” means “yes” or “no,” and whether God A is True or False.
  • Note: This question does not involve God Random, so we avoid the unpredictability.

Step 2: Identifying God B or God C

Now that we know whether “da” means “yes” or “no,” and whether God A is True or False, we can proceed.

Question 2 to the Same God A:

“Is God B Random?”

  • Since we now know whether God A is True or False and the meanings of “da” and “ja,” we can interpret the answer accurately.
  • If God A says “da,” and we know “da” means “yes,” and God A is True, then God B is Random.
  • If God A says “ja,” and we know “da” means “yes,” and God A is True, then God B is not Random.

Question 3 to God C:

  • If we have determined that God B is not Random, then God C must be Random or the remaining identity.
  • Ask God A or God B (whichever is not Random):

“Are you False?”

  • Using the known meanings of “da” and “ja,” and knowing the identity of the god we’re asking, we can interpret the answer to identify the remaining gods.

Final Summary

  • Step 1: Determine whether “da” means “yes” or “no,” and whether God A is True or False, by asking a self-referential question.
  • Step 2: Use the information from Step 1 to find out whether another god is Random.
  • Step 3: Identify the remaining gods by elimination and by asking direct questions, now that we can interpret the answers correctly.

Answering Your Original Question

In my initial explanation, I prematurely concluded that God C is not Random based on the first question, which was incorrect because the information from the first question alone is insufficient to make that determination due to the complexities introduced by:

  • The unknown meanings of “da” and “ja.”
  • The unpredictable nature of Random’s answers.
  • The possibility that God A could be Random, making any conclusion unreliable.

I apologize for the oversight and any confusion it may have caused.


Key Takeaways

  • Always account for all variables in logic puzzles, especially when dealing with unknowns and random elements.
  • Use questions that can be interpreted unambiguously, even with limited information.
  • It’s crucial to verify each step’s conclusions for accuracy before proceeding.

I hope this clarifies how to correctly approach the problem and explains why the initial conclusion about God C was incorrect. Please let me know if you have any more questions or need further clarification.

3 Likes

I used o1 for an hour and wow…

Comparing 4o and o1 is like comparing a Prius to a Ferrari… Both will get you to your destination, the Ferrari just gets you there faster.

1 Like

@traditionals15 It is not a good analogy, looks like a ferrari - 4o - and a RAM 4x4 - o1 - not fast but fitted for road and off-road uses.

1 Like