New reasoning models: OpenAI o1-preview and o1-mini

The access restriction based on usage tiers for developers delays access to innovations and puts lower-tier users at a disadvantage. Providing equal access to a broader audience could accelerate the platform’s adoption and allow developers to work more efficiently.

I’ll tell you the trick: you only have to pay $20.

A ChatGPT Plus subscription gets you 30 messages per week to o1-preview and 50 to o1-mini. That $20 could be spent in a minute on the API.

There is nothing to “develop” - you basically get ChatGPT on the API.

You can work until ChatGPT on a normal model is simply unable to produce a working solution, and then regenerate with one of the o1 requests from that weekly budget.

For example, when GPT-4o writes stupendously stupid, wrong, and dangerous stuff over and over, “o1” is there to tell you what it thinks of ChatGPT.

Put more simply: this is not what an “AI for all of humanity” mission looks like. At least give those lower tiers smaller limits.

We will expand access to tier 1. Stay tuned here and in your inbox.

10 Likes

Thanks for your support and good news @edwinarbus

It would be good to call on the QA/dev/eng workforce here to test the new releases. In this forum we reach a huge diversity of nations and languages. Feel free to make an informal waitlist of those who are interested and contributing in the forum.

I tested out o1-preview on coding, trying to use it to create a Drone Swarm War Game Simulator. Overall it worked quite well through many iterations of dialog, where I was able to get o1-preview to add features, fix bugs, explain its reasoning, discuss the best path forward, etc. It makes me feel as if I have a team of programmers working for me, and I was able to come up very quickly with a nice GUI showing numerous autonomous drones seeking many targets in a coordinated fashion.

One issue that I’d like to bring up to the team is that on several occasions o1-preview, in its pursuit of fulfilling my latest requirement, chose to subvert previously given directives, even though it was told explicitly not to do so. Sometimes it would even do so without giving notice. I see this as a problem because if I am building a large system with hundreds or thousands of directives, I would definitely prefer that such changes be made in a prominently visible manner.

Here are two examples of what I described above:

  1. o1-preview was told that a target is not visible to a drone unless they are less than X pixels apart, which was followed correctly in an earlier version of the generated code. When asked to optimize for minimal kill time, o1-preview decided to remove this directive, making all targets visible to all drones at all times, which of course allows the drones to eliminate all targets very quickly.
  2. o1-preview was told that a drone is to be erased once it hits a target (i.e., it is a suicide drone), which was followed correctly in an earlier version of the generated code. When asked to further improve the joint search efficiency for finding invisible targets, o1-preview decided to invalidate the directive and let drones live on in order to improve search efficiency.
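
For what it’s worth, the two directives are small enough to pin down as explicit, testable functions. A minimal sketch (the names and the pixel value are invented for illustration, not taken from the generated code):

```python
import math

VISIBILITY_RADIUS_PX = 80  # "X pixels": assumed value for illustration

def is_visible(drone_pos, target_pos, radius=VISIBILITY_RADIUS_PX):
    """Directive 1: a target is visible only within `radius` pixels."""
    dx = drone_pos[0] - target_pos[0]
    dy = drone_pos[1] - target_pos[1]
    return math.hypot(dx, dy) < radius

def resolve_hits(drones, targets):
    """Directive 2: a drone is erased once it hits a target (suicide drone)."""
    surviving_drones, remaining_targets = [], list(targets)
    for d in drones:
        hit = next((t for t in remaining_targets if d["pos"] == t["pos"]), None)
        if hit is not None:
            remaining_targets.remove(hit)  # target eliminated...
            continue                       # ...and the drone is NOT kept
        surviving_drones.append(d)
    return surviving_drones, remaining_targets
```

With the rules isolated like this, an “optimization” that removes one of them shows up as a visible edit to a named function (or a failing test) rather than a silent behavioral change.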

My feeling is that for o1 to be used for coding a large system, there needs to be some kind of directive management facility, where all conflicts with, or changes to, the directives are made highly visible and easily manageable to the user; otherwise it could get really hard to know what’s going on.
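
One shape such a facility could take, sketched here with invented names rather than any existing tool, is a registry that each regenerated version is audited against, so a dropped directive surfaces loudly:

```python
class DirectiveRegistry:
    """Tracks the directives a generated system must honor and
    flags any that a new version drops."""

    def __init__(self):
        self._directives = {}  # id -> description

    def add(self, directive_id, description):
        self._directives[directive_id] = description

    def audit(self, implemented_ids):
        """Return the directives missing from a new version."""
        implemented = set(implemented_ids)
        return {
            did: desc
            for did, desc in self._directives.items()
            if did not in implemented
        }

registry = DirectiveRegistry()
registry.add("D1", "Target visible only within X pixels of a drone")
registry.add("D2", "Drone is erased once it hits a target")

# A regenerated version that only implements D2 gets flagged loudly:
for did, desc in registry.audit(["D2"]).items():
    print(f"DIRECTIVE DROPPED: {did}: {desc}")
```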

Hope that this is helpful to the team! Thank you for this great piece of product!

8 Likes

It is working in a very powerful way. I use the API in a medical chatbot, and the difference compared to 4o was dramatic. But I’m having trouble with bigger outputs: sometimes it thinks for up to 40 seconds and then just stops, mostly in long conversations.
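
One workaround for the long-conversation case, assuming the stalls correlate with how much context the model has to reason over, is to trim older turns before each request. A rough sketch that approximates tokens by words (the budget number is invented):

```python
def trim_history(messages, max_words=2000):
    """Keep the first (context-setting) message plus as many of the
    most recent turns as fit in a rough word budget."""
    if not messages:
        return []
    head, tail = messages[0], messages[1:]
    kept, used = [], len(head["content"].split())
    for msg in reversed(tail):
        words = len(msg["content"].split())
        if used + words > max_words:
            break
        kept.append(msg)
        used += words
    return [head] + list(reversed(kept))
```

A real version would use a proper tokenizer, but even this crude cut keeps the request size bounded as the conversation grows.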

I am a developer, I need Tier 5 access, how can I upgrade?

Oliver

https://platform.openai.com/docs/guides/rate-limits/usage-tiers

1 Like

Thank you OpenAI for launching these new reasoning models! Excited to start the journey with o1-preview and o1-mini. These models offer improved capabilities in solving complex problems, especially in science, coding, and math. I am very interested to see how they can improve my application development.

The release of OpenAI’s new reasoning models, o1-preview and o1-mini, marks a major advancement in AI problem-solving. o1-preview offers deep reasoning, while o1-mini is a cost-effective, faster alternative. Both models aim to enhance science, coding, and math tasks.

Well done, OpenAI! What a brilliant idea! Why offer unlimited access to paying customers when you can just slap them with a nice 6-day block? This way, they get a chance to ‘reflect’ on their ‘excessive’ usage!

2 Likes

I’m editing a Skillshare class about customGPTs and I’m in the unusual situation where the main ChatGPT might give better outputs temporarily :woman_facepalming:t2: Does anyone know when o1 will be used for customGPTs?

I’m also thinking through if I need to change how I explain it. The lesson I’m due to edit today shows micro step prompts.

1 Like

Hello @nikunj, first of all I would like to congratulate the OpenAI team on this new model. I had always imagined a version where the model gets time to think before coming to a final answer. You know, in engineering we sometimes say the correct solution is not always the acceptable solution; sometimes we prefer the optimum solution, and that is where thinking before answering comes in. This model has brought AI one step closer to the human brain. With all this said, I have a few queries which the OpenAI team should address.

  1. You have made this model 5 to 10 times costlier, which is a real pain point for us; switching to this API will take a considerable amount of approval from a cost point of view.
  2. File input is still not supported; I hope you will be addressing this feature in the future.
  3. This model is still not available to us via the API. I am sure it will be available soon, correct?
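
On the cost point, the multiplier is easy to sanity-check from per-token prices. The figures below are the launch prices as I understand them, and the hidden reasoning-token count is a guess, so treat the whole calculation as an assumption to verify against the pricing page:

```python
# Assumed launch prices, USD per 1M tokens (verify before budgeting)
GPT4O = {"input": 5.00, "output": 15.00}
O1_PREVIEW = {"input": 15.00, "output": 60.00}

def request_cost(prices, input_tokens, output_tokens):
    """Cost of one request in USD from per-million-token prices."""
    return (input_tokens * prices["input"]
            + output_tokens * prices["output"]) / 1_000_000

# Same visible answer, but o1 also bills its hidden reasoning tokens
# (say 3000 of them) as output on top of 1000 visible completion tokens.
cost_4o = request_cost(GPT4O, 2000, 1000)
cost_o1 = request_cost(O1_PREVIEW, 2000, 1000 + 3000)
print(f"4o: ${cost_4o:.4f}  o1-preview: ${cost_o1:.4f}  "
      f"ratio: {cost_o1 / cost_4o:.1f}x")
```

With a few thousand hidden reasoning tokens per answer, the effective ratio can land at or above the top of that 5-to-10x range.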

Thanks @nikunj, trying it out now. At Adthena we have built quite a few tools already using your API. The main one being a data analyst for our clients that scans billions of data points to answer questions and produce insights on their data.

It is going to be really interesting to see how the o1 model performs in this architecture. If you or anyone here is interested how this was built, I am speaking about it at the oxycon conference at the end of this month.

My talk is " Harnessing Gen AI for Data-Driven Answers". Would love to get feedback in here on whether people are building similar things and how they have been approaching the same challenges.

Welcome rsgh0914 and thanks for that post. This is VERY cool. Thanks for sharing.

An unfettered “o1” model framework (not released, and here running for many hours on 10,000 candidate answers to six questions) would have an even more extreme operational cost, and thus a practical one is offered:

(An IQ hit from a safety-training penalty after this is also possible.)

The expanded availability of the o1-based models, and more features, is a hopeful possibility already addressed in the announcement; repeating “I want” is tedious.

The lack of any way for a developer to provide guidance, original behavior, and domain constraints, its denial of a large domain of API use cases, and the simple fact that it performs far better on a single benchmark-style input than on user language and iterative chat (in both quality and timely presentation) make it an enigma to find an application or value-add for.

Like OpenAI’s contest process, I would put a user-facing chatbot in front of the API model, to give it customized input and to avoid wastefully employing it outside of its reasoning specialization, much like the auto-switching ChatGPT that OpenAI sees as a possibility.
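
That routing idea can be sketched in a few lines; the keyword heuristic below is a crude stand-in for whatever front-end model or classifier would actually make the call, and the model names are just labels:

```python
REASONING_HINTS = ("prove", "derive", "optimize", "debug",
                   "step by step", "algorithm")

def pick_model(user_message: str) -> str:
    """Route reasoning-heavy requests to the expensive reasoning model,
    everything else to the cheaper chat model."""
    text = user_message.lower()
    if any(hint in text for hint in REASONING_HINTS):
        return "o1-preview"
    return "gpt-4o"
```

In practice the classifier would itself be a cheap model call, but the shape is the same: the reasoning model only ever sees input curated for it.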

My experience:
GPT-o1 is absolutely stunning; it exceeds my expectations by an order of magnitude.

The Test
We are building an Autonomous Innovation system, based on an award-winning innovation framework.
We test ingenuity, creativity, and the level of disruption.

  • The task is to build a geothermal energy generator and get down into the earth’s crust to a level where we have 500 °C, the same heat that we have in nuclear power plants. This is at about 20 to 25 km depth.
  • The problem is that we can’t drill that deep, for many known physical limitations.

It took only 2 prompts
We gave it the above challenge.

It failed to get to a meaningful solution, so we asked it to question every condition needed to get there.
The answer was a disruptive solution.
The key aspect of the result is below; it reads as if the model is surprised by its own discovery.

GPT’s Answer:

Mind-Bending Aspect

By questioning the necessity of a vertical approach, we unlock a new paradigm for accessing deep geothermal energy:

  • Redefining “Depth”: Depth doesn’t have to be measured purely in vertical terms. By combining horizontal and inclined tunneling, we can reach the same geothermal horizons without the limitations imposed by vertical shafts.
  • Bringing Equipment to the Heat: Transporting steam turbines and other heavy machinery through inclined tunnels is more practical than lowering them down vertical shafts. This approach allows us to set up fully functional power generation facilities deep underground.
  • New Industrial Frontiers: Establishing underground facilities at such depths could revolutionize not only energy production but also mining, scientific research, and underground construction techniques.

Countless scientists could not get there and came up with bizarre futuristic concepts. After the above answer we know it is feasible today. We can already drill tunnels over 50 km long, and 500 °C is nothing for today’s steel tools.
Our GPTBlue will use gpt-o1-preview as early as next week. :slight_smile:

4 Likes

I tried using ChatGPT-o1-preview, and my conclusion is that its performance is astounding. Its reasoning ability is incredibly high, which truly surprised me.

Firstly, ChatGPT-o1-preview is specifically focused on reasoning capabilities. Compared to ChatGPT-4o, it provides more detailed analysis and consideration, offering thorough explanations.

One major difference from ChatGPT-4o is that, while it can certainly provide abstract explanations, ChatGPT-o1-preview tends to offer more practical, real-world code solutions. It also excels at handling complex hardware and software configurations. Therefore, an effective approach might be to use ChatGPT-4o for defining requirements, basic design, and foundational research, and then summarize the content before passing it to ChatGPT-o1-preview. This workflow, where ChatGPT-4o helps structure the context and prompts for ChatGPT-o1-preview, proved to be highly efficient.
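
The workflow described above (ChatGPT-4o structures the context, then hands a summary to ChatGPT-o1-preview) can be sketched as a two-stage pipeline. `complete` here stands in for whatever API call you use, so this is a shape, not a client:

```python
def two_stage(task_notes: str, complete) -> str:
    """Stage 1: the cheap model condenses requirements into a tight brief.
    Stage 2: the reasoning model solves from that brief alone."""
    brief = complete(
        model="gpt-4o",
        prompt="Summarize these requirements into a concise brief:\n" + task_notes,
    )
    return complete(
        model="o1-preview",
        prompt="Produce a working solution for this brief:\n" + brief,
    )
```

Injecting the completion function also makes the staging logic testable without any network calls.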

By doing this, you can use ChatGPT as a “second brain,” freeing up mental resources for other tasks. This allows you to focus on system architecture planning, environment setup, complex debugging, and more, thereby accelerating work across a wider range of fields. I am confident that this approach will significantly boost development speed. It’s truly revolutionary, and I’m personally very impressed.

I’m looking forward to its official release and to being able to use it more frequently.

4 Likes

A post was split to a new topic: ChatGPT down for anyone else?

Will the current GPT-4o models become cheaper after the o1 models are fully released?