Realtime API: Voice Pitch Change in mid conversation

I am facing some unique problem here.

When i am initiating a session with a specific voice with a Instruction to produce response in Non-English Language (ex: Hindi). Its giving the response however i am facing 2 problem here

  • Pitch changing during midway of ongoing response (ex. first 10 word in Pitch 1 and 2nd 30 word in pitch 2) Sometime it even switches from female to male (high pitch change)
  • Pitch change during conversation messages (message1: Pitch 1 and Message 2: pitch 2)

Please suggest the workaround

Model: gpt-4o-mini-realtime-preview-2024-12-17 & gpt-4o-realtime-preview-2024-12-17

Thank you

Did you find a workaround @dev_ivoz ? We see the same issue with these APIs. Pitch changing.

I do not see this with gpt-realtime or the new gpt-realtime-mini https://platform.openai.com/docs/models/gpt-realtime-mini

but I am only using western languages.

Hi, We are facing the same problem, it seems to change voice constantly. By any chance anyone found a solution?

I’ve started this issue with a specific agent. Huge changes in the tone and pitch within single sentences.

This agent has a more complex prompt that the other agents I have deployed.

Hi,

Do you have a code snippet that will reproduce the pitch changing behaviour?

We are not seeing this mid-sentence but between responses. The voice sounds so dissimilar that it sounds like a different “person” speaking and is very jarring. This issue got much worse around the same time that gpt-realtime-2 was announced and continues to be a problem.