We’ve developed a new series of AI models designed to spend more time thinking before they respond. They can reason through complex tasks and solve harder problems than previous models in science, coding, and math.
Today, we are releasing the first of this series in ChatGPT and our API. This is a preview and we expect regular updates and improvements. Alongside this release, we’re also including evaluations for the next update, currently in development.
It’s finally here!
We have two new models to play with, o1-preview and a smaller, faster, and cheaper o1-mini.
So excited to start playing with this model, just upgraded to tier 5 so I can use it via the API!
It wrote basically the same paper as gpt-4o but used a few more words and gave it a slightly better structure. It did add a short section on mitigation strategies, which gpt-4o missed, so it did technically write a better paper.
Some comparisons for this specific task: the paper gpt-4o wrote is around 1,100 tokens, and the paper o1 wrote on the same topic is around 2,500 tokens. My dedicated paper writing system generates a paper around 4,400 tokens in length for the same topic. That system takes a lot longer than o1, but it has web access, so it grounds the paper in about a dozen different web searches…
I tested the o1-preview model on the first question of IMO 2024. It thought for 70 seconds and gave the wrong answer, as expected. What I didn’t expect was that the wrong answer was the same as GPT-4o’s, with no improvement in scope or comprehensiveness.
No, they marked it as deprecated for all models… Every single OpenAI consumer now needs to change the word max_tokens to max_completion_tokens. That’s just dumb…
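For anyone who would rather not hunt down every call site by hand, here is a minimal sketch of one way to handle the rename in a single place. It only rewrites a plain dict of request parameters before you pass it to whatever client you use. The helper name `migrate_token_param` is made up for illustration and is not part of any OpenAI library. The two parameter names are the ones mentioned above.

```python
def migrate_token_param(params: dict) -> dict:
    """Return a copy of the request params with max_tokens renamed.

    Hypothetical helper for illustration only; it does not call any API.
    """
    migrated = dict(params)  # shallow copy so the caller's dict is untouched
    if "max_tokens" in migrated:
        value = migrated.pop("max_tokens")
        # If the new name is already set, keep it and just drop the old one.
        migrated.setdefault("max_completion_tokens", value)
    return migrated


# Example: parameters written against the old name.
old_params = {"model": "o1-preview", "max_tokens": 2500}
new_params = migrate_token_param(old_params)
print(new_params)
```

You could call this once inside your own request wrapper, so existing code that still passes max_tokens keeps working while you migrate at your own pace.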