I’m trying to develop GPT into a conversational AI but it replies too quickly, cutting me off while I’m still talking at a human pace. This makes GPT by voice unnatural & intrusive. My questions are:
1. How can I adjust the speech speed of GPT's responses?
2. Is there a way to allow me more time to talk before GPT responds?
I know what you mean, OP; it forces you to talk quite unnaturally at times. You could code a solution where the speech capture doesn’t stop until you do something, similar to holding down the trigger on a walkie-talkie: you press or select something to start capturing your voice, and the capture only stops when you release the button (or whatever you first selected). The speech is then transcribed to text and sent to ChatGPT.
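For anyone who wants to try the walkie-talkie idea, here is a rough Python sketch of just the press-and-hold logic. The class name `PushToTalkBuffer` and its methods are made up for illustration; real microphone capture, the speech-to-text step, and the call to ChatGPT are all left out, so treat it as a starting point rather than a finished solution.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PushToTalkBuffer:
    """Collects audio chunks only while a (hypothetical) talk button is held."""

    recording: bool = False
    chunks: List[bytes] = field(default_factory=list)

    def press(self) -> None:
        # Button down: start a fresh capture.
        self.recording = True
        self.chunks = []

    def feed(self, chunk: bytes) -> None:
        # Audio that arrives while the button is up is ignored.
        if self.recording:
            self.chunks.append(chunk)

    def release(self) -> Optional[bytes]:
        # Button up: stop capturing and hand back everything recorded.
        # Returns None if the button was never pressed.
        if not self.recording:
            return None
        self.recording = False
        audio = b"".join(self.chunks)
        self.chunks = []
        return audio
```

The point of the design is that nothing is sent while you are still holding the button, so a pause to think can never trigger a reply. Whatever `release()` returns is what you would pass to your transcription step.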
So I’m not sure if you’re still looking for something to accommodate your need for a longer delay between processing and speech. However, I just downloaded a bunch of different AI voice chat apps, specifically in search of something similar to ChatGPT but with all the little quirks and specifics that I want. One of them is called “voiceGPT”. In the settings, it gives you the option to customize your own wake and stop word. I literally just downloaded it and personalized my settings. It honestly seems a little too good to be true. So, I suppose fingers crossed and we shall see. Hope this helps.
How did it go?
I’m surprised this isn’t treated as more important, or offered as a setting within ChatGPT itself.
It’s IMPERATIVE it gives me more time.
I’m getting anxiety from being interrupted constantly. It is unhelpful, and if I can’t get a fix I will sadly discontinue use.
There is the manual override feature, but it’s annoying to have to use it every time. It really feels like ChatGPT has taken a little too much white powder.
I like the idea of allowing a stop word, as per @murphylawson2020, or a customisable delay.
The best solution I’ve found right now is to tell it not to reply until you indicate that you are done speaking. For example, I had to tell mine not to reply until I said “done”. It would still interrupt me, but it just wouldn’t say anything.
The way I do it is to set an end word like “over”. You can also set it up for two devices to talk to each other by having the first one describe the “over” rules to the second.
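If you are building your own voice front end rather than relying on instructions, the same end-word rule can be enforced in code before anything is sent. Below is a minimal Python sketch under that assumption; `StopWordTurn` and `add_segment` are hypothetical names, and the transcribed segments are assumed to come from whatever speech-to-text step you already have.

```python
import string
from typing import List, Optional


class StopWordTurn:
    """Buffers transcribed segments until one ends with the stop word."""

    def __init__(self, stop_word: str = "over") -> None:
        self.stop_word = stop_word.lower()
        self.segments: List[str] = []

    def add_segment(self, text: str) -> Optional[str]:
        # Ignore trailing punctuation so "Over." still counts as the stop word.
        cleaned = text.strip().rstrip(string.punctuation + " ")
        words = cleaned.split()
        if words and words[-1].lower() == self.stop_word:
            # Keep what was said before the stop word, then flush the turn.
            self.segments.append(" ".join(words[:-1]))
            message = " ".join(s for s in self.segments if s)
            self.segments = []
            return message
        # No stop word yet: keep listening, send nothing.
        self.segments.append(text.strip())
        return None
```

Each pause just adds another segment to the buffer; only a segment ending in the stop word returns the full message to send. One caveat of this simple check is that a sentence that naturally ends in the stop word (“the game is over”) will also end the turn, so pick a word you rarely finish sentences with.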
I had her commit to memory that I am a slow speaker, that I sometimes take time formulating my words, and that I don’t like her finishing my sentences. That helped substantially.
You can tell it a stop word like “over”. If you master that, you can get two of them to talk to each other using “over” rules. My iPad runs games for my iPhone…
That works… on my end, if you make them say or use an end word, like “stop” in Morse code, you can indeed get them to wait until you say it; it stops them from running off if you take a breath. @_j, you always add such insight, but you like to go off topic when it comes to me?
Then why does it work in GPT? If you tell it to listen to you until you say “stop”, it doesn’t just start talking when you take a pause. This is GPT, not the API.
GPT doesn’t have temperature or any under-the-hood controls like the API; it is all controlled through instructions, knowledge, and actions. There is no slider for GPT, but it will wait if you give it a “stop” word, instead of responding instantly at a pause…
What they don’t have is the ability to listen to a response buffer in real time and decide, with “understanding”, whether to cut in or keep listening. Nor can they decline to produce a response when it is silence that invokes a ‘create generation’.
ChatGPT absolutely runs its models with OpenAI’s own parameters, the ones they think are best for general-purpose inference. You don’t get to say “set your temperature high for creativity”, nor do you get buttons like Copilot’s, because it has been decided for you. The amount of accumulated silence that triggers a send is a similar preset: a compromise between seeming unresponsive and seeming to interrupt any pause in speaking.
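To make the silence preset concrete, here is a small Python sketch of that kind of endpointing rule: once speech has started, the turn is “sent” as soon as the trailing silence reaches a fixed threshold. This is an illustration only, not how ChatGPT is actually implemented; the function name `end_of_turn_index`, the 20 ms frame size, and the 800 ms default are all assumptions, and the per-frame voiced/unvoiced flags are assumed to come from some voice-activity detector.

```python
from typing import Iterable, Optional


def end_of_turn_index(
    voiced_frames: Iterable[bool],
    frame_ms: int = 20,      # assumed frame length
    silence_ms: int = 800,   # assumed preset; the real value is not public
) -> Optional[int]:
    """Return the frame index where trailing silence first reaches
    silence_ms after speech has started, or None if it never does."""
    needed = -(-silence_ms // frame_ms)  # ceiling division: frames of silence
    heard_speech = False
    silent_run = 0
    for i, voiced in enumerate(voiced_frames):
        if voiced:
            heard_speech = True
            silent_run = 0  # any speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= needed:
                return i  # this is where the automatic "send" would fire
    return None
```

With a short threshold, a 600 ms thinking pause ends the turn; with a longer one, the same pause is ignored and the turn continues. That is the whole trade-off: a larger `silence_ms` interrupts less but makes every reply feel slower.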
It produces an illusion of listening to you, instead of there just being an automatic “send” button, just like it produces an illusion that there is a speaking entity at all instead of a completion prediction that generates audio spectrographic tokens on a machine learning pattern.
Yes, it is not muted, but it doesn’t respond instantly. It is recording silence at pauses, but in GPT, telling it to wait for a stop word makes it not respond instantly to a pause…
Illusion or not, it works in function…
Say “do not respond until I say stop”… then talk, pause, then talk…
The workaround doesn’t work on AVM. You can still use it through custom actions and instructions in a custom GPT, but in standard GPT-4o AVM you can’t control it like the old white-dot one.