Can (custom) GPT speak and respond via voice?

khan.mosameer · November 11, 2023, 4:20pm

Hi Everyone!

I’m new to the community and I have limited programming experience. I was excited to see the new update where we can create a GPT with limited coding knowledge. I have begun creating GPTS on ChatGPT 3.5 on the Pro Plan using the new update.

I want to make a GPT where instead of having the user experience be typing out and having text based conversations, I want to have the conversation be able to happen with the GPT responding via voice and the user can also respond by voice.

Is there a way I can turn the GPTs responses to voice and also have the user respond via their voice and have the GPT respond via voice audio? So in short, I want to make the experience feel more like a natural conversation with someone rather than typing everything out.

Is there an easy way to do this? If so, how would one go about this?

anon10827405 · November 11, 2023, 4:24pm

Yes. It’s built in. If you are using the app all you need to do is press the headphone icon and it starts a continuous voice conversation.

It’s a little glitchy though, but that’s TTS engines in general.

If you are talking about API then yes, also possible. You just need to facilitate it yourself.

Opinion: You may want to consider something like Eleven Labs. They are kicking ass with TTS.

They offer a lot more control. You can create and tune your own model (can even use your own voice, I have David Attenborough as my personal assistant, it’s great!), it’s cheaper, there’s a vast library of available models, and they have some good prompting elements like SSML (I think it’s called) (it’s for pronunciation) and pausing.

dmisi98 · November 12, 2023, 11:43pm

Do they offer streaming option as well? So no need to wait the end of the audio.

dmisi98 · November 12, 2023, 11:49pm

Yes, as @anon10827405 said, if you download the app, you can just speak with your gpt. It’s an AMAZING experince!.

However, I’m making a React-Native app where my goal to reach the same. With extra feature > To generate images with dalle while its talking, so if I need a story + an img, I will hear the story and when its finish the audio, I will see the img as well (or even before).
At the current stage with GPT, it will speak the story and then start generating the img, which is little anoying.

I’ll do it open source so if you or anyone is interested, feel free to reach me aout

m1m2 · November 12, 2023, 11:50pm

Hi. After reading the information in this thread I have a question. Can I organize and set up simultaneous voice translation during an online meeting? If it is possible how can I technically realize it?

anon10827405 · November 13, 2023, 12:04am

Yes they offer streaming

dmisi98 · November 13, 2023, 12:25am

Did you tried in ChatGPT app the voice chat function?

m1m2 · November 13, 2023, 12:33am

dmisi98 I haven’t tried the voice chat feature yet, no interlocutor. I’m in the middle of the night, so I’ll try it in the morning. Do you have any positive experience using simultaneous interpretation with voice chat?

dmisi98 · November 13, 2023, 1:17am

I really love it. Only issue, it can’t make parallel jobs like speaking and generating img in same time. But possibli you do not need that

jrileydinsmore · November 20, 2023, 9:52pm

I’m a bit confused, everyone here is saying its built into the app, but when I use the app, there is no headphones icon and seemingly no way to talk to GPT.

Is this only included if you pay for 4.0?

s.jennings1990 · November 21, 2023, 12:42am

Hi. Yes, it is only available with the paid version. It simply doesn’t show as an option on the free one.

s.jennings1990 · November 21, 2023, 12:44am

I’ve taken your recommendation and am blown away with Eleven Labs, but is there a way to merge it with Chat GPT? I love the conversational capabilities of Chat GPT but love the natural TTS of Eleven Labs. It would be great if I could merge the two somehow.

Please note: I have very limited - almost no - coding ability!

anon10827405 · November 21, 2023, 12:52am

Unfortunately you can’t connect ElevenLabs with GPTs. It would only be something workable with Assistants (API)

Artgeek · November 21, 2023, 3:25pm

There is a chrome (and edge) extension called Talk-to-ChatGPT (its not an app, it only works in a browser) that allows you to chat with chatGPT and integrate elevenlabs voices.

You just need to enter your elevenlabs API. No coding required.

homesabcd · May 25, 2024, 1:38am

where can I get the gpt 4o to speak in voice?

mizeng0925 · September 29, 2024, 7:29pm

I created a custom GPT and specified in the settings that it should operate in voice conversation mode.

The voice output feature worked fine on the first day, but now it has suddenly stopped working. I’ve checked the settings and permissions, and confirmed that no changes were made.

I want to know where the issue might have occurred and how I can fix it. By the way, I only made changes in the instructions. TT

Topic		Replies	Views
Is it just me or do custom GPTs not support voice at all? GPT builders	4	525	January 1, 2025
GPTs with Custom Actions by Whisper API and TTS Feedback gpts	18	6489	December 4, 2023
To propose integrating a microphone functionality into the application Prompting gpt-4 , chatgpt , plugin-development , api	4	2323	January 7, 2024
Voice in the GPTs after update doesn't work GPT builders	14	3121	January 21, 2025
Can't use the interactive voice chat in the IOS app anymore Community chatgpt	34	7943	November 3, 2024

Can (custom) GPT speak and respond via voice?

Related topics