Seeking clarity on limited availability of "o1-mini" and "o1-preview" models: Technical constraints or strategic decision?

As an enthusiastic developer exploring OpenAI’s offerings, I’ve noticed that the “o1-mini” and “o1-preview” models are not universally available via API. I’m curious about the rationale behind this decision and would love some insights from the community or OpenAI team.

For context, these models are part of OpenAI’s newer offerings, promising improved performance and capabilities. Their limited availability has sparked curiosity among developers like myself who are eager to experiment with cutting-edge AI technologies.

I’ve considered several potential reasons for this limitation:

  1. Technical constraints: Perhaps there’s a shortage of GPUs or other hardware necessary to meet widespread demand.
  2. Quality control: OpenAI might be gradually rolling out access to ensure optimal performance and gather feedback.
  3. Strategic decision: This could be part of a tiered access strategy to incentivize higher usage or spending.

If it’s due to hardware limitations, I wonder if a solution could be implementing very low rate limits (e.g. 50 requests per day) for lower-tier users. This would allow more developers to experiment with the models while managing resource allocation.
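To make the suggestion concrete, here is a minimal sketch of what such a daily cap could look like as client-side bookkeeping. This is purely hypothetical illustration on my part; real tier limits are enforced server-side by OpenAI and surface as HTTP 429 errors, and the class name and numbers here are made up.

```python
import time

class DailyQuota:
    """Hypothetical client-side sketch of a 50-requests-per-day cap.

    Not an OpenAI feature: actual tier limits are enforced server-side.
    """

    def __init__(self, max_requests_per_day=50):
        self.max_requests = max_requests_per_day
        self.window_start = time.time()
        self.count = 0

    def try_acquire(self):
        now = time.time()
        # Reset the counter once a 24-hour window has elapsed.
        if now - self.window_start >= 86_400:
            self.window_start = now
            self.count = 0
        if self.count >= self.max_requests:
            return False  # quota exhausted; caller should wait for the next window
        self.count += 1
        return True

quota = DailyQuota(max_requests_per_day=50)
allowed = sum(quota.try_acquire() for _ in range(60))
print(allowed)  # prints 50: only the first 50 of 60 attempts pass
```

A wrapper like this would at least let a lower-tier developer experiment without surprise bills, whatever limit OpenAI actually chose.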

As a small-scale developer who has spent just over $10 so far, it’ll take some time before I reach higher usage tiers. While I understand the need for resource management, I can’t help feeling a bit left out of these exciting developments.

Has anyone else experienced similar feelings or have insights into this situation? I’m particularly interested in hearing from those who have gained access to these models – what has your experience been like?

Ultimately, I’m hoping for more transparency about the availability roadmap for these models. It would be incredibly helpful for planning and development purposes.

Looking forward to hearing the community’s thoughts and experiences!


I have asked myself the same question, because I wonder whether it will have the same capabilities as GPT-4o or whether it is intentionally being limited to Chat Completions only. I suspect it is not yet ready to act as an “assistant”, or even to retrieve files on chatgpt.com, and that is why it is in its current state; but it is hard to say for certain. They seem to have classed this as a “reasoning” model, which is supposedly more costly and slower.

  1. “Trust us, bro” token billing, at 4x the cost of the underlying model, with the input massively amplified into unseen autorecursive context tokens of “read these policies, AI” text.


A wish for 50 requests per day would blow through that “$10 so far” in a heartbeat and push the account into arrears overages before any shutoff. Talk of the expense on this forum is currently overshadowed only by the realtime voice API, which can burn that $10 in a single quick chat.

At least OpenAI finally listed, on the pricing page, a discount off the elevated billing rate for cached context (which this model is essentially tailored for), but I have seen 0 “cached” tokens in usage, and that figure can only be inferred in aggregate for the whole organization.
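For anyone trying to audit this themselves, the hidden billing can be sketched from the `usage` object a Chat Completions response returns: reasoning tokens are reported under `completion_tokens_details.reasoning_tokens` (and billed as output even though they are never shown), and cached input under `prompt_tokens_details.cached_tokens`. The per-million prices below are assumptions I pulled from the o1-preview pricing page at the time and may well change; the example payload is made up.

```python
# Assumed o1-preview list prices (USD per 1M tokens); check the pricing page.
PRICE_PER_M = {"input": 15.00, "cached_input": 7.50, "output": 60.00}

def o1_cost(usage: dict) -> dict:
    """Estimate the bill for one response from its Chat Completions usage dict."""
    cached = usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)
    uncached = usage["prompt_tokens"] - cached
    reasoning = usage.get("completion_tokens_details", {}).get("reasoning_tokens", 0)
    cost = (
        uncached * PRICE_PER_M["input"]
        + cached * PRICE_PER_M["cached_input"]
        + usage["completion_tokens"] * PRICE_PER_M["output"]  # includes reasoning tokens
    ) / 1_000_000
    return {
        "billed_usd": round(cost, 6),
        # share of billed output tokens you never actually get to read
        "hidden_reasoning_share": reasoning / usage["completion_tokens"],
    }

example = {
    "prompt_tokens": 2_000,
    "completion_tokens": 1_500,
    "prompt_tokens_details": {"cached_tokens": 0},  # matches the 0-cached observation
    "completion_tokens_details": {"reasoning_tokens": 1_200},
}
print(o1_cost(example))  # billed_usd 0.12, hidden_reasoning_share 0.8
```

In this illustrative case, 80% of the output tokens you pay $60/1M for are invisible reasoning, which is exactly the “trust us” part.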

  2. New API methods that require reading documentation, adapting software, and finding a use case, all while respecting the “not for production” implication of “preview”.

Well, right now their own models are competing with each other, with Anthropic’s Claude, Google’s Gemini, and Meta’s Llama; outside of that, there is not much competition. They are even, in a sense, competing with Azure Copilot/GPT if you think about it. So it will be interesting to see whether the price goes down. Historically it has, to drum up demand and likely draw in more revenue. Who knows what monstrous resources this model requires beyond GPT-4o, but I surmise it must need quite a bit of additional processing power. Once it gains more of the ChatGPT functionality that the traditional models already have, I think we could see the cost start going down, but it is all just conjecture.
