Realtime API extremely expensive

I only tried a few conversations in the playground, just small ones asking about a bedtime story, for a few minutes or so. Nothing special. Yet, these tests already set me back $3.

With the current pricing model, integrating the Realtime API into our SaaS applications seems infeasible. I can’t see developers launching applications that depend on the Realtime API while offering users unlimited access to explore its features. The costs are just too high to justify any subscription model for end users. When this basic use case already set me back $3, I can’t imagine how much it will cost once tool calling and more advanced use cases are involved :sweat_smile:

While I appreciate that quality services come with a price, at this rate it feels more like an impressive tech demo than a viable solution for widespread implementation. The pricing would need to drop by at least a factor of 10 to make it truly accessible.

12 Likes

Quick calculations from their numbers ($0.06/min input plus $0.24/min output, i.e. $3.60 + $14.40 per hour) show about $18/hour, so yes, more than minimum wage, but hey, the bot can speak any language :wink:

On the serious side, I totally agree. It’s great for a tech demo but it’s totally unusable in the field. You can easily make a small company go broke if you leave one of these assistants available 24/7 on their website as, for example, a customer care representative.

5 Likes

Yes, I saw some results: 20,000 tokens cost $5.
With tool usage and multiple agents, you can go broke in the blink of an eye. :laughing:

You can get burned in Realtime :see_no_evil:

1 Like

I have two theories:

  1. It is to gauge how people react to pricing on par with human labor. If above-human-level AGI with agents is reached, then the pricing could be about the price of an hour of human work.

–OR–

  2. It is a step towards being a product company. So far OpenAI has worked heavily with partners, with a special focus on startups and indie developers since the early days. Many of us are grateful for that. However, that might change in the future. One politically correct way is to raise prices for new features so that it is only feasible to use them through a ChatGPT subscription.

I hope neither is true.

1 Like

I interacted with the demo app for 90 seconds and it cost me $0.65 for the Realtime API and $0.01 for GPT-4o. We cannot sell any service that costs this much just for the API itself. I must admit that the voice experience is extremely good.

I have built a platform using Google transcription and TTS, integrated with Twilio Voice. My cost per minute for my platform is a fraction of the Realtime API’s. Response times are acceptable for a human-like conversation. Anyone looking for a voice interface on OpenAI at an affordable price, let me know.

A realtime-mini preview for gpt-4o-mini is coming soon, as stated in their post. Probably October 30th, when the next DevDay takes place in London.

1 Like

It somehow cost me $28 Canadian (about $20 US) in just one morning! I barely used it, just played around with it for a bit. It didn’t seem like I was even using it for an hour total. It makes for a cool party-trick demo: I can make a few videos of it doing some cool accents and expressions, etc. However, that’s about it. Otherwise it’s back to the regular API and Whisper/TTS. That charge was just crazy for what I got out of it.

Historically, these prices drop quite a bit after launch, so they shouldn’t deter anyone from trying to build a product with it.

The important factor is that there’s a good amount of control given to us. You can switch from the Realtime API to the typical STT → TTS paradigm without much friction. So if the conversation involves longgggggg discussions and latency isn’t as important, switch it over.
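For reference, that fallback path is only a handful of calls with the regular SDK. A minimal sketch, assuming the standard Python client; the model names are simply the common choices at the time of this thread, not a recommendation:

```python
# Rough sketch of the classic STT -> chat -> TTS pipeline as a fallback to the
# Realtime API. Model names ("whisper-1", "gpt-4o-mini", "tts-1") are just the
# usual picks at the time of writing — swap in whatever you actually use.
from openai import OpenAI

client = OpenAI()

def voice_turn(audio_path: str, history: list[dict]) -> str:
    # 1) Speech-to-text with Whisper
    with open(audio_path, "rb") as f:
        transcript = client.audio.transcriptions.create(model="whisper-1", file=f)

    # 2) Text reply from a cheaper chat model
    history.append({"role": "user", "content": transcript.text})
    completion = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    reply = completion.choices[0].message.content
    history.append({"role": "assistant", "content": reply})

    # 3) Text-to-speech for playback
    with client.audio.speech.with_streaming_response.create(
        model="tts-1", voice="alloy", input=reply
    ) as speech:
        speech.stream_to_file("reply.mp3")
    return reply
```

Latency is noticeably higher than the Realtime API, but the per-minute cost is a fraction of it.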

Also, think of existing jobs like tutoring or customer service, where the person on the other side should be paid a consistent minimum of $20/hour regardless of what’s happening.

I think people should look at the big picture here. We can have a single, scalable, “serverless” agent that works across all modalities, on multiple fronts, using the same source of truth (your company documents). This, in my opinion, is inevitable.

1 Like

Their estimates of $0.06/min for input and $0.24/min for output seem to be a bit off. Even with push-to-talk enabled, I still feel like I am racking up a bill at a faster rate than that.

What I do not understand is that minutes ago it showed token usage of 45K input vs. 4K output … for 3.6 minutes of talk, and there is just no way I recited 20K-plus words in 3.6 minutes… then, two minutes later, it showed only 17K, of which 1,700 tokens were output. What the heck? I was charged twice today, for a total of $12, for 3.6 minutes???
Before the Realtime API, I was orchestrating my own AI assistant via Whisper and the o1-mini model, and for the entire month of September my cost was < $6. That’s with me conducting daily audio chats… so what gives?

Something must be wrong with how the tokens are being calculated. I would not be able to afford to test it, and it would be impossible to use in production.

1 Like

I suspect it scales up as the conversation history gets longer, since the model takes the whole context into consideration.

But since just a small test cost me $3, I’m not really willing to confirm that theory for now.

Anyway, I would be careful with longer conversations.
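A back-of-the-envelope sketch of that theory. The assumptions are mine, not official: every new turn re-bills the entire audio history as input, and the token rates are the ones implied by the figures quoted elsewhere in this thread ($0.06/min in at $100/M tokens, $0.24/min out at $200/M tokens):

```python
# Why input tokens could snowball: if each turn re-sends the whole conversation,
# the per-turn input bill grows linearly and the total grows roughly quadratically.

IN_TOKENS_PER_MIN = 0.06 / (100 / 1_000_000)    # ~600 input audio tokens per minute (implied)
OUT_TOKENS_PER_MIN = 0.24 / (200 / 1_000_000)   # ~1200 output audio tokens per minute (implied)
IN_PRICE = 100 / 1_000_000                      # $ per input token
OUT_PRICE = 200 / 1_000_000                     # $ per output token

def conversation_cost(turns: int, user_secs: float = 10, bot_secs: float = 20) -> float:
    """Estimated cost if each turn is billed against the full prior conversation."""
    history = 0.0
    cost = 0.0
    for _ in range(turns):
        user_tokens = IN_TOKENS_PER_MIN * user_secs / 60
        bot_tokens = OUT_TOKENS_PER_MIN * bot_secs / 60
        cost += (history + user_tokens) * IN_PRICE   # whole history re-sent as input
        cost += bot_tokens * OUT_PRICE
        history += user_tokens + bot_tokens          # history keeps growing
    return cost

for n in (5, 10, 20):
    print(f"{n:>2} turns: ~${conversation_cost(n):.2f}")
```

Under these assumptions the first few turns cost cents, but by a couple dozen turns the re-sent history dominates the bill, which would line up with the numbers people are reporting here.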

2 Likes

Look, I’ve just had a short conversation using the new Realtime feature and 20k tokens cost me $5. Is this for real, or a bug? It doesn’t match the published pricing.

3 Likes


I’m seeing the same thing

about 14.5 cents per 1000 tokens, with a 1:7 i/o split

144 bucks per million

approximately aligns with pricing:

https://openai.com/api/pricing/

100 bucks in, 200 bucks out

pricey!

the example they give

*Audio input costs approximately 6¢ per minute; Audio output costs approximately 24¢ per minute

since the models speak very slowly and there’s no way to speed up the speech, it’s super expensive. (I don’t know if you get charged for generated speech that goes unspoken when you interrupt.)

I’m thinking the actual in/out ratio will be around ~1 min in to 2 min out, so an hour will cost you $1.20 + $9.60, approx $11 an hour

More than minimum wage in some states!
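For what it’s worth, a quick sanity check of the two per-hour figures floating around this thread, using the approximate per-minute rates quoted above:

```python
# Sanity check of hourly cost using the approximate published rates:
# ~$0.06 per minute of audio input, ~$0.24 per minute of audio output.

INPUT_PER_MIN = 0.06
OUTPUT_PER_MIN = 0.24

def hourly_cost(minutes_in: float, minutes_out: float) -> float:
    """Cost of one hour of conversation split into listening and speaking minutes."""
    return minutes_in * INPUT_PER_MIN + minutes_out * OUTPUT_PER_MIN

# Worst case: billed for input and output across the whole hour -> the ~$18/hr figure.
print(hourly_cost(60, 60))   # 18.0
# A 1:2 in/out split over a 60-minute conversation -> roughly the ~$11/hr figure.
print(hourly_cost(20, 40))   # 10.8
```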

4 Likes

Yep, it’s totally unusable commercially at these prices for any sort of for-profit business, unless you have millions of VC money to burn as a loss leader or for a proof of concept.

I’ve talked to it for a minute and it cost me $0.75, so an hour would be $45.

5 Likes

I dug into that… My findings here:

3 Likes

This is similar to the cost of calls at the beginning of the telephone era :grin:

6 Likes

I just did another quick test where I watched the logs to see the deltas stream in. The audio streams in way faster than it takes to play back. The audio streaming finishes immediately after the text generation finishes.

While you can maybe roughly go off playback length, time-wise, to assess cost, it truly is tokens you’re paying for. The only way to minimize costs is to minimize the size of your prompt and maybe use max_tokens to restrict the generated output length.
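A minimal sketch of what that could look like over the Realtime WebSocket, assuming the beta interface as documented at launch; the model string, URL, and the `max_response_output_tokens` field name are assumptions to verify against the current reference rather than take from here:

```python
# Hedged sketch: keep the session instructions short and cap generated tokens.
# Assumes the beta Realtime WebSocket interface as documented at launch.
import asyncio
import json
import os

import websockets  # pip install websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview-2024-10-01"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def main():
    # Note: newer versions of the websockets library call this parameter additional_headers.
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        # Short instructions and a hard cap on generated tokens to limit spend.
        await ws.send(json.dumps({
            "type": "session.update",
            "session": {
                "instructions": "You are a concise voice assistant. Keep answers brief.",
                "max_response_output_tokens": 250,
            },
        }))
        # ... stream audio in and handle response deltas here ...

asyncio.run(main())
```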

2 Likes