Advanced Voice Mode released 09/25/2024

Today OpenAI finally released Advanced Voice Mode to most or all of the paid Plus subscribers.

Today I noticed some missing features and limitations:

  1. You cannot use the video camera feature.
  2. You cannot attach or upload any content, like PDFs, docs, or anything else.
  3. You cannot paste content from the Internet into the chat for the voice mode to reference, but it can reference previous conversations YOU had with it only, it seems…
  4. It cannot access the Internet. Its knowledge cutoff is April 2023.
  5. Sometimes it reverts to the older chat variant.
  6. You cannot switch to different models, like the new reasoning models o1 / o1-mini.
  7. You don’t have access to Sky, and you cannot have the model sing. Sarcasm is fun.
  8. Also, it seems a lot of people must be using this, so I’m not sure there is enough compute to serve all the subscribers right now; mine is no longer working. There is no indicator of tokens being used up, etc…
    Update: Now it is loading. I am also noticing a 15-minute time limit :frowning:

Anything anyone else can learn about the new Advanced Voice Mode release?

All that aside, I see a lot of potential. Bring me these features ASAP, OpenAI.

Thanks! :smile:


I can confirm: the app told me that I now have access to Advanced Voice Mode, I had the new voice selection when I started voice mode, and it even confirmed, when I talked to it, that it’s the new advanced voice mode.
But none of the features from the demos are working for me: it cannot imitate animals, it cannot sing, it doesn’t let me interrupt, and it can’t search the internet.
It’s as if it’s just the old voice mode with a new voice and a new UI?
I tried resetting and reinstalling the app, granted all permissions, checked the microphone settings, and restarted the device.
Nothing helped.

Is everyone experiencing the same on Android?
I tested on Android; the app is up to date and has all Android permissions.

Update: as of today it works for me too. As I suspected, I only saw the notification for Advanced Voice Mode in the app and had the new voice selector, but it was still the old mode.

I’ll check on my S22 Ultra when my 15-minute daily limit resets, probably later this afternoon.

It is the old version, but mutilated into a useless app.

Okay, so I tested on my S22 Ultra, and I was able to get it to work, with the interruption feature and everything, the same as on my iPhone.


It is the new version, but it is restricted as I mentioned above. I wouldn’t say it is completely useless, but the 4o model on its own is certainly much more useful right now.

Thank you for your response. I believe that an update should enhance and improve upon the previous version rather than reduce its capabilities.


I believe OpenAI is rightly honoring its commitment to rigorous safety standards in deploying Advanced Voice Mode.
Unfortunately, the balance with usability isn’t right yet.
Here are my observations and recommendations:

I think that limiting the usability of AI in pursuit of high safety standards is its own safety risk. In Uganda we have a legend about a girl whose mother was so protective that she never let her out of the house or do any work; the story has a frightening twist.

When such technology can only be used “usefully” by the parent company and its affiliates, it poses a great risk that the future leans toward an AI corporate cabal system. Effort to encourage more productive use by more people globally is as important to AI safety as preventing misuse. I am not entirely sure whether a carrot-and-stick scenario applies here, but it might. Notice I said productive use, not recreational use.

The following are some of my recommendations for balanced, safe deployment of Advanced Voice Mode and subsequent ChatGPT models.

  1. Offer more globally varied voices and accents to choose from so that people don’t feel cheated of the expected functionality. In many communities where English is a secondary tongue, people relate more to local accents; the colonial/western accents tend to carry a negative or aloof connotation. As a result, adoption of AI will be further delayed in such communities, widening the already alarming tech gap.

  2. Make the fast-response / low-latency aspects available in the API. Just conversation-grade speed would be a great start. The other major Advanced Voice Mode features, such as the ability to interrupt the model, can be left to OpenAI, as they have rightly argued that they have to raise money somehow (a discussion for another time). I have noticed in many of my experiments with API use that the difference between a productive feature and a terrible one is very often the speed of the response. First-time users tend to prematurely confirm their assumption that the AI use case is another parlour trick when the response takes forever. Maybe the API could offer speed-focused functionality that is charged differently.
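To make the speed point above concrete: for streamed responses, the number that decides whether a feature feels conversational is usually time-to-first-chunk, not total generation time. Here is a minimal, self-contained Python sketch of how you might measure it; `fake_stream` is a hypothetical stand-in for a real streaming API response (no actual OpenAI SDK calls are made), so treat this as an illustration, not official usage.

```python
import time
from typing import Iterable, Iterator, Tuple


def time_to_first_chunk(stream: Iterable[str]) -> Tuple[float, str]:
    """Return (seconds until the first chunk arrived, full assembled text).

    Works with any iterator of string chunks, e.g. deltas from a
    streaming chat-completion call.
    """
    start = time.perf_counter()
    it: Iterator[str] = iter(stream)
    first = next(it)                       # blocks until the first chunk arrives
    latency = time.perf_counter() - start  # this is the "conversational feel" metric
    text = first + "".join(it)             # drain the rest of the stream
    return latency, text


def fake_stream(chunks, delay=0.05):
    """Simulated model output: each chunk arrives after `delay` seconds."""
    for c in chunks:
        time.sleep(delay)
        yield c
```

With a real streaming call you would pass the response iterator instead of `fake_stream`; a sub-second first chunk generally feels conversational, while multi-second waits trigger the "parlour trick" reaction described above.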

:pray:t4:


I like a YouTube channel called Digital Engine. They put out content every month or two, but are consistent. A little alarmist regarding AI, but you may enjoy it.

Now that I have access to it, Advanced Voice Mode constantly flags my input for no reason. Even when I ask about very basic things it keeps flagging my input, especially as the conversation gets longer, with honestly nothing malicious whatsoever being discussed.