GPT-4o Audio Access for API

sheikh.alireza · May 13, 2024, 7:26pm

I’m looking at GPT-4o and see options for text and vision but no options for voice. Would that be up coming? Also, would there be an option for audio input and output like you guys demo earlier. Very exciting features. I love hiw you tackled latency issues.

Thiago · May 13, 2024, 7:37pm

Found this on another thread:

So it seems like we won’t be able to use audio for now…

allenjlawson · May 13, 2024, 7:43pm

So bummed!!! I was looking for this, too. UGH! What a tease, OpenAI!

sheikh.alireza · May 13, 2024, 7:51pm

Yea, we need Audio in/out. Wondering what would happen if we send encoded Audio rather than image since multi modal. We need audio out though to deal with latency problem.

NotFenixio · May 13, 2024, 7:58pm

According to the announcement, video and audio inputs will only be available to a small group of partners for now.

sheikh.alireza · May 13, 2024, 8:00pm

How can we get on thay small list ?

NotFenixio · May 13, 2024, 8:27pm

I don’t really know, but I think by partners they mean Microsoft and so on.

jedashford · May 13, 2024, 8:28pm

Bummer. Talking to chatgpt is mildly interesting, but using the tech to enable my business is most interesting.

owencmoore · May 13, 2024, 8:32pm

It’s not available yet.

gpt-4o is live in the API with support for text and vision modalities. Support for audio is coming in the following weeks to a small group of trusted partners.

sellum · May 13, 2024, 11:30pm

And then for everyone I hope?

lamannab.b · May 14, 2024, 4:08pm

Buon giorno io non riesco ad inviare istruzioni vocali a chatgpt 4o

mccoyb00 · May 14, 2024, 5:35pm

None of that was mentioned that I saw in the presentation. This outright deception is getting old. Every company does this every time. They make a promise or show you something and then say “it’s going to be released today” and then you read the fine print and it’s not actually true. Or it’s true to people who pay a lot of money. In other words what we got was an upgrade speed and price. Which is great. But not what was implied and who knows when we will get access to it. This is an example of how the rich get richer and the poor get poorer. All the advantages are given to those who can pay much much more than anyone else. AI is being refined at its current level. I don’t believe we will get much further than this anytime soon. This is not logarithmic growth or exponential growth as we’ve been promised now for two years. This is giving us crumbs left over by the giant fish in the pond. I’ve been using gpt4 since it came out. Probably about 200 hours over the course of almost 2 years. There’s nothing markedly different about gpt4 and gpt4 now. Yes it’s much better but these are refinements. Good ones. But it’s incremental. This is why I’ll never use gpt4 as my main api even though I’m at tier 4. This behavior is getting old by these fake product launches.

sheikh.alireza · May 14, 2024, 7:18pm

Great points. It kindda felt like that for sure. Whatever “a small group of trusted partners” mean. I hear you. Yea, we do the same try to keep our LLM layer flexible and use multiple providers. This latency thing for audio is a game changer though.

spacemoo · May 15, 2024, 2:01am

Is there any possibility to become a trusted Partner ?
Thanks in advance

tkrynski · May 16, 2024, 2:15pm

Can you release the API documentation for voice mode so we can start planning for how to integrate it?

stephen.w.c.j · May 19, 2024, 10:24am

I’m building my ios app, and all I need is that voice api. Please release it soon.

oded1 · May 19, 2024, 2:02pm

We are a heavy GPT user. How can we become A Trusted Partner?

jakke.lehtonen · May 19, 2024, 6:26pm

If you have to ask here, never. Sorry. But on that situation you aren’t important enough to OpenAI.

giovannisimo · May 20, 2024, 5:51pm

Is it correct that for now to go from speech2speech we need to run speech2text transcription → chat completion → text2speech synthesis?

spmarketinguk · May 20, 2024, 5:53pm

I thought they just announced it was available for everyone?.. I wish they would tell the truth in these hyping, back-patting videos.

Topic		Replies	Views
When will audio to audio be released for gpt-4o please? API gpt-4o	8	4712	July 2, 2024
Will the API for the New Voice Be Released Separately? API	4	2989	September 3, 2024
GPT-4o text to speech and speech to text API	19	19017	September 30, 2024
GPT-4o New Voice Model, API Release API	21	22221	July 23, 2024
Advanced Voice Mode for API API	22	19089	October 5, 2024

GPT-4o Audio Access for API

Related topics