"Speak faster" instructions that work for the Realtime API?

The standard pace of each of the voices is slower than most people talk. When you’re in a conversation and you ask it to speak faster, it will do so for the next sentences.

However, I’ve not been able to get it to speak faster from the beginning, regardless of the words I use in the instructions - except for the over-the-top “kid on sugar” example, which has a moderate effect.

Anyone who’s had success with specific instructions?

15 Likes

I have the same issue too. The Realtime API sounds very unnatural and slow, at least compared to advanced voice mode.

Just bumping this up one more time. Considering the likes it got, @jeffreyhuhao and I must not be the only ones running into this. Did anyone find an answer?

Same problem here. Advanced voice mode does it with the same prompt.

I also am looking for a way to make the voice faster! Ideally it could be controlled by the end user via a slider → param input, but would settle for faster across the board.

Oh, I did that once, but speeding it up made the voice squeaky and slowing it down made it deep. It has something to do with the frequency. I had a GitHub PR that did this, but I closed it a while back.

Dealing with the same here. Anyone find a solution?

We’re experiencing the same issue and haven’t found a useful solution yet. I tried supplying additional system messages like “speak faster than normal” as custom context during response.create, but that didn’t seem to help much either. In general, my observation is that as the conversation progresses, it tends to lose its ability to consistently follow the supplied instructions.

That is the sample rate. Changing it does make the playback faster or slower, but it also shifts the pitch, so it is probably not the approach most people here want.

I think we want it to generate the voice speaking faster natively, without needing to speed it up by manipulating the audio ourselves.
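
To illustrate why the sample-rate trick changes the pitch, here is a minimal sketch. It assumes the audio is 24 kHz, 16-bit mono PCM (an assumption on my part, adjust to whatever format you actually receive), and all function names are made up for this example. It writes the same samples into a WAV container with a different declared rate and reports the resulting duration and perceived tone frequency.

```python
import io
import math
import struct
import wave

SOURCE_RATE = 24_000  # assumption: audio arrives as 24 kHz, 16-bit mono PCM


def make_tone(freq_hz, seconds, rate=SOURCE_RATE):
    """Generate a sine tone as raw little-endian 16-bit PCM bytes."""
    n_samples = int(seconds * rate)
    return b"".join(
        struct.pack("<h", int(16000 * math.sin(2 * math.pi * freq_hz * i / rate)))
        for i in range(n_samples)
    )


def wav_with_declared_rate(pcm, declared_rate):
    """Wrap the unchanged samples in a WAV header that claims a different rate."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(declared_rate)
        w.writeframes(pcm)
    return buf.getvalue()


def playback_stats(wav_bytes, original_freq_hz, source_rate=SOURCE_RATE):
    """Return (duration in seconds, perceived frequency in Hz) at the declared rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        rate = w.getframerate()
        duration = w.getnframes() / rate
    return duration, original_freq_hz * rate / source_rate


if __name__ == "__main__":
    pcm = make_tone(440, 1.0)
    for declared in (24_000, 36_000):
        duration, freq = playback_stats(wav_with_declared_rate(pcm, declared), 440)
        print(f"declared {declared} Hz: {duration:.3f} s, tone heard at {freq:.0f} Hz")
```

Playing the samples 1.5x faster this way shortens the clip and raises every frequency by the same factor, which is the squeaky effect described above. Speeding up without the pitch shift needs a time-stretching algorithm rather than a rate change.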

I haven’t tested this yet, but the newer voices should do this just fine, right?
Maybe you can get a good outcome by having the prompt as

You feel uneasy and have to speak incredibly fast as the current situation is very stressful and you're on a deadline to finish it.

Gaslighting AI, if you will.

Cheers! :hugs:

Anyone get the speed figured out?

Facing the same problem here. I tried these prompt elements:

  • Speak faster
  • Speak as if the matter is urgent
  • Speak at normal conversational pace
None of these make any of the voices sound conversational. They all sound slow and robotic.

Anybody found a good solution yet?

For anyone still hunting for this, the new speed parameter works great in the realtime API. You can find it under https://platform.openai.com/docs/api-reference/realtime_sessions/create and search for “speed”. Description: “The speed of the model’s spoken response. 1.0 is the default speed. 0.25 is the minimum speed. 1.5 is the maximum speed. This value can only be changed in between model turns, not while a response is in progress.”
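
In case it helps, here is a rough sketch of how a slider value could be turned into a session update. The 0.25 to 1.5 range comes from the description quoted above. The event shape (a `session.update` event with `speed` directly under `session`) is my assumption, so check the linked reference for where the field lives in the API version you use. The helper names are hypothetical.

```python
import json

MIN_SPEED = 0.25  # minimum from the quoted parameter description
MAX_SPEED = 1.5   # maximum from the quoted parameter description


def slider_to_speed(position):
    """Map a slider position in [0, 1] linearly onto the allowed speed range."""
    position = min(max(position, 0.0), 1.0)
    return round(MIN_SPEED + position * (MAX_SPEED - MIN_SPEED), 2)


def build_speed_update(speed):
    """Build a JSON event string that sets the session speed, clamped to the limits.

    Assumption: the field is named "speed" and sits at the top level of "session".
    """
    clamped = min(max(speed, MIN_SPEED), MAX_SPEED)
    return json.dumps({"type": "session.update", "session": {"speed": clamped}})


if __name__ == "__main__":
    # Send the resulting string over your existing Realtime connection,
    # between model turns rather than while a response is in progress.
    print(build_speed_update(slider_to_speed(0.8)))
```

Clamping on the client keeps an out-of-range slider value from producing a rejected request.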

I notice that whenever I modify the speed parameter, the voice changes and the instructions are completely ignored. On some occasions the model started to speak an entirely different language. Did you have similar problems? This was when I sped up the voice.

Oh no! I have so far only tested that it did indeed speed up the voice, and haven’t done more extensive testing yet. Thank you for the warning!