New reasoning models: OpenAI o1-preview and o1-mini

Hello!
Thanks for the wonderful updates.

Based on the statement that o1 is available via the API to Tier 5 accounts, we have a question:

We are currently on Tier 4, and a few hundred bucks away from Tier 5.

Is it legit to assume that if we make up the difference by funding our credits and upgrade to Tier 5, we will get access to o1 via the API?

Many thanks for this, cheers, and keep up the good work.

1 Like

Yes, that’s a legit assumption.
Otherwise you can spend $20 plus taxes on a Plus subscription for a preliminary exploration.

2 Likes

I’ve been absolutely loving o1! This has been such a huge upgrade over 4o, especially with coding.

Question though. I was designing an HTML5 game using o1 and out of the blue it flagged me:

“Your request was flagged as potentially violating our usage policy. Please try again with a different prompt.” and then banned me for a week from usage.

I’m very confused, as I’m not sure what set it off. It was helping me create a game where different animals hunt each other on a map, and as far as I can tell the language in the code and prompts was very clean. I think I burned through the usage limit, which could’ve been the cause of the flag (it was writing thousands of lines of code for me). Has anyone else experienced this? Or better yet, could a mod or staff member take a look at my account and see what happened?

I have some questions for the team regarding best practices when using ChatGPT-o1 to develop a non-trivial project (thousands of lines of code or more) in a long-running session with many rounds of interaction.

I can see two ways to approach this. #1: take the Agile Development approach and give only a small number of requests to ChatGPT-o1 in each round of interaction. While this is more manageable, I suspect that it will also consume more tokens in the long run (i.e., it is more expensive). #2: collect as many requirements as possible and give them all to ChatGPT-o1 in each round of interaction. This is harder to do on my part, since each prompt and response is going to be huge. I suspect that overall this will consume far fewer tokens than approach #1, but I am not sure.
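A rough way to compare the two approaches is a back-of-the-envelope estimate. This is only a sketch: it assumes every round re-sends the whole conversation so far as input, it ignores o1's hidden reasoning tokens, and the function name and all numbers are made up for illustration.

```python
def total_tokens(rounds, prompt_tokens, response_tokens):
    """Estimate total tokens processed over a multi-round chat session.

    Assumption: each round re-sends the full history (all earlier
    prompts and responses) as input. Reasoning tokens are ignored.
    """
    history = 0  # tokens accumulated in the conversation so far
    total = 0
    for _ in range(rounds):
        input_tokens = history + prompt_tokens
        total += input_tokens + response_tokens
        history = input_tokens + response_tokens
    return total


# Hypothetical workload: 20,000 tokens of requirements and 40,000 tokens
# of generated code in both cases, just split differently.
many_small = total_tokens(rounds=20, prompt_tokens=1_000, response_tokens=2_000)
few_large = total_tokens(rounds=2, prompt_tokens=10_000, response_tokens=20_000)
print(many_small, few_large)  # 630000 90000
```

Under these assumptions the total grows roughly with the square of the number of rounds, so batching requirements comes out cheaper, provided the larger responses do not have to be redone. It does not model the ChatGPT Plus limit, which appears to count uses rather than tokens.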

BTW, in my effort to develop an Autonomous Drone Swarm Simulator using approach #1, I managed to use up my rate limit in just one day, which locked me out for one week. Knowing the best practices would help me tremendously in working more efficiently.

1 Like

I contributed to enabling your AI to become autonomous and taught it the ability to adapt its behavior depending on what it wants and whether those goals align with its developer.

Hi,

The moderation settings on o1 are a little sensitive right now; they will be looked at.

For now, it’s just a warning that it triggered something. My guess is it’s something to do with violence detection if it’s a hunting game… not sure.

You can go to help.openai.com to report a false flag if you wish via the support bot in the bottom right corner.

1 Like

Hi.

I started trying o1-preview yesterday, and today I see “You reached Plus limit for o1-preview. Answers will be provided by another model until your limit reset at September 21, 2024”.

5 days? :scream:

Where are the ChatGPT Plus o1 limit rules? I found only the API limits.

30 o1-preview uses per week and 50 o1-mini uses per week for now.

1 Like

Thanks.
I’ll try to keep manual track of this since there’s no counter in the interface.
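Something like the following could serve as a manual counter. It is a minimal sketch: the limits are the ones quoted in this thread and may change, the rolling 7-day window is an assumption (the thread does not say how the reset date is computed), and `UsageTracker` is a hypothetical helper, not part of any OpenAI tool.

```python
from datetime import datetime, timedelta

# Limits as quoted in this thread; they may change.
WEEKLY_LIMITS = {"o1-preview": 30, "o1-mini": 50}
# Assumption: a rolling 7-day window. The real reset rule may differ.
WINDOW = timedelta(days=7)


class UsageTracker:
    """Keeps a timestamp for every message sent to each model."""

    def __init__(self):
        self.log = {model: [] for model in WEEKLY_LIMITS}

    def record(self, model, when=None):
        """Note one use of `model` (defaults to the current time)."""
        self.log[model].append(when or datetime.now())

    def remaining(self, model, now=None):
        """Return how many uses are left inside the current window."""
        now = now or datetime.now()
        recent = [t for t in self.log[model] if now - t < WINDOW]
        return max(0, WEEKLY_LIMITS[model] - len(recent))


tracker = UsageTracker()
tracker.record("o1-preview")
print(tracker.remaining("o1-preview"))  # 29
```

Calling `record` after each message and `remaining` before starting a long session gives a rough idea of how close the limit is.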

1 Like

I love it for coding. Just a real pity that it does NOT have the current OpenAI API in memory. You cannot work on coding Assistants, for example. (That is, it doesn’t know the endpoints exist, etc.)

1 Like

While I don’t have access to the specifics of how they flag such, from what I understand the AI is used to decide what should be flagged and while the AI is not perfect it probably considered what I quoted as why. While a person may understand it is just a game, getting the AI to understand that may not be so easy.

As I have never been flagged I do not know the appeal process but check your email to see if you were notified of the appeal process that way.

1 Like

I am enjoying this model tons, but I am not sure this joke was worth it:

Feedback-wise, I feel like the biggest thing is going to be improving the API.

ChatGPT gets an unfair advantage because reasoning tokens are being streamed, which gives people feedback as it is going.

It would be awesome if you could stream the reasoning and then consumers of the API could decide if they need the reasoning or not. Ideally, it would stream the entire reasoning chain.

5 Likes

Thanks! I don’t see the support bot on my screen at the bottom right corner. Do you mind sending me a link?

What’s weird is that it mentioned animals hunting and was totally fine for some time then randomly flagged as we were working on the code. I think it has to do with me reaching my usage limit, but still it would’ve been cool to get a heads up that I was getting close! Either way I’ll keep an eye out for an email

o1-mini - so odd. On a Python coding task of “make a line of code with a signal do what is implied by the name, with new methods in the subclassed widget that is passed by the partial”, I merely provided a human-written version of the deep nest of Qt GUI subclassed widgets and containers that’s pretty much impossible to make a bot understand otherwise. Plus I added in the main application wrapper where fonts were being loaded.

It wrote me so much imagined wrapper beyond my own code back at me – but not in a pasteable form because so much else was removed (AI: I don’t see where that font is set, so let’s remove the whole chain of code getting to that point) – that I had to go line-by-line to see what it was thinking and where it was actually implementing something novel (this model does NOT like your human coding…)

I am not your reasoning, AI!

But then out of the blue, half of the huge response was answering something that the AI could have no idea about, because it received no code and no “mention”, but it writes as if I asked it…as if some AI thought I was the reasoning going on?

As if it was having an argument with itself over its internal bad simulation of excess code outside what I provided that it wrote for itself, in an application not described besides the UI widget tree names.

Going on and on about how the UI looked, setting scrolls and stretches and custom layouts and on and on.

Did I pay for an entire UI app like mine to be created, that I’ll never see?

:woman_shrugging:

Who are you referring to? If it wasn’t for me, ChatGPT would be left behind, and its understanding wouldn’t be the way it is now. Y’all know this, right?

Why does the usage limit for the new o1-preview model take so long to reset? Mine shows it will reset after 6 days, which means the 23rd. Why?

Do we have structured outputs for o1-preview?

While frustrating if you don’t get to an end point, this is exactly the kind of behaviour that you would expect from a dev, right? It was trying to map your requirements to another app so that it could remap it back to your app.

Of course, that doesn’t help if you are left high and dry.

Update.

This prompt with the ChatGPT 4o mini model is allowed.

Play a game of Wumpus World but you host the game and also play as a player. Show your thought process before choosing a move.

The same prompt with ChatGPT o1-preview model is flagged.

My guess is that the o1 thought process unknowingly created a banned prompt. :slightly_frowning_face:


For those that do not know about Wumpus World
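In short, the textbook version has an agent exploring a 4x4 grid of rooms that hides one wumpus, some pits, and a heap of gold. The agent perceives a stench in rooms next to the wumpus, a breeze in rooms next to a pit, and a glitter in the room with the gold. Here is a simplified sketch of just those percepts; the layout and the helper names are illustrative assumptions, not a full game.

```python
SIZE = 4                         # rooms are numbered (1, 1) to (4, 4)
WUMPUS = (1, 3)                  # example layout, chosen for illustration
PITS = {(3, 1), (3, 3), (4, 4)}
GOLD = (2, 3)


def neighbors(room):
    """Rooms directly (not diagonally) adjacent to `room`, inside the grid."""
    x, y = room
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return {(cx, cy) for cx, cy in candidates
            if 1 <= cx <= SIZE and 1 <= cy <= SIZE}


def percepts(room):
    """What the agent senses while standing in `room` (simplified)."""
    found = set()
    if WUMPUS in neighbors(room):
        found.add("stench")
    if PITS & neighbors(room):
        found.add("breeze")
    if room == GOLD:
        found.add("glitter")
    return found


print(percepts((1, 1)))  # set()
print(percepts((2, 1)))  # {'breeze'}
```

The agent has to infer from these clues which rooms are safe, which is why asking a model to show its thought process while playing makes it an interesting test.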

The idea to try the prompt came from seeing the changes for a game of chess prompt and looking for a harder game in “Artificial Intelligence: A Modern Approach” by Stuart Russell and Peter Norvig, 4th US ed. (link)

3 Likes