We represent a company that has developed an AI assistant for specific users. As we prepare to move this service into production, we are looking to improve and guarantee response speed for our users.
We came across the API Scale Tier and are interested in enabling it for our project. However, we could not find clear instructions on how to activate this tier or what the associated costs are.
Could someone from the OpenAI team clarify:
How we can request access to the Scale Tier?
What the pricing model is?
Whether the tier includes SLAs or latency guarantees?
You request access by first being an Enterprise customer. Expect it to sit a level beyond the $200k monthly cap that the regular API usage tiers top out at. The top has a link.
The pricing is by guaranteed input and output units. For example, one unit of 30k input tokens/minute (equivalent to the rate limit of tier 1, btw) would be $3300 a month, subject to whatever wheeling and dealing your account manager can sell you. That's a starting point to one-half of the minimum purchase needed.
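To put that example figure in perspective, here is a rough back-of-the-envelope sketch of what one such unit would cost per million input tokens. It uses only the two numbers quoted above (30k input tokens/minute, $3300 a month). The 30-day month, the `utilization` parameter, and the function name are my own assumptions for illustration, not anything from OpenAI's pricing.

```python
# Back-of-the-envelope only: the 30k tokens/minute and $3300/month figures come
# from the example above; everything else here is an illustrative assumption.

MINUTES_PER_MONTH = 60 * 24 * 30  # assumes a 30-day month


def effective_cost_per_million(
    tokens_per_minute: int,
    monthly_price_usd: float,
    utilization: float = 1.0,
) -> float:
    """Return USD per million tokens for a reserved unit.

    `utilization` is the fraction of the reserved capacity actually used
    (hypothetical parameter; 1.0 means the unit is saturated around the clock).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_month = tokens_per_minute * MINUTES_PER_MONTH * utilization
    return monthly_price_usd / (tokens_per_month / 1_000_000)


if __name__ == "__main__":
    for u in (1.0, 0.5, 0.25):
        cost = effective_cost_per_million(30_000, 3300, u)
        print(f"utilization {u:.0%}: ${cost:.2f} per million input tokens")
```

At full utilization that works out to roughly $2.55 per million input tokens, and the effective price climbs quickly if the reserved capacity sits idle, so it's worth estimating your real traffic before talking to sales.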
I understand the confusion. The Scale tier is only available to Enterprise accounts, and based on your inquiry, your company doesn't meet the criteria for an Enterprise upgrade. The best alternative is a Team account, but this tier doesn't include Scale access.
@sps has flagged this to the team, and they'll coordinate with support and sales to improve the process going forward.
It's probably not the answer you were hoping for, but I hope it clarifies things!
@sps Yes! @vb Argh, that's the response I was unfortunately expecting. It's really frustrating to have to migrate to Azure just to get a dedicated endpoint or even provisioned throughput units (PTUs).
I really hope OpenAI will offer a solution for Team users, or ideally everyone, that allows for ultra-low latency without requiring $200k/month in spending.