Hi everyone,
we have had trouble getting realtime audio output in peninsular Spanish consistently. The voice samples seem to be skewed towards South American Spanish. We have had some success by asking in almost every prompt to “speak in peninsular Spanish”, but roughly every 10th word still comes out wrong.
I usually tell it to write in Castilian, but the South American accent never fully goes away with realtime. If you switch to the models that only do “speech to text” (STT) you will have better luck, but they can't chat as fluidly as gpt-realtime. Good luck!
Hence, explicitly naming the regional accent in the prompt should make the model more likely to adhere to it.
You can also prompt the model to speak as a person from a specific city: Madrid for a Castilian accent, Bilbao for a Basque accent, or Seville for an Andalusian accent.
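In case it helps, here is a rough sketch of how the city-based prompt could be put into the session instructions. It assumes you configure the session by sending a `session.update` event with an `instructions` field; the exact session shape may differ between API versions, so check it against the version you use. The mapping and the helper names (`ACCENT_CITIES`, `build_accent_instructions`, `build_session_update`) are made up for illustration, and the prompt wording is only a starting point, not a tested recipe.

```python
import json

# Hypothetical mapping from accent to reference city, taken from the suggestion above.
ACCENT_CITIES = {
    "Castilian": "Madrid",
    "Basque": "Bilbao",
    "Andalusian": "Seville",
}


def build_accent_instructions(accent: str) -> str:
    """Return instruction text that anchors the accent to a specific city."""
    city = ACCENT_CITIES[accent]
    return (
        f"Speak Spanish as a native speaker from {city}, Spain, "
        f"with a {accent} accent. Use peninsular Spanish vocabulary and "
        "pronunciation throughout, and do not drift into Latin American Spanish."
    )


def build_session_update(accent: str) -> str:
    """Serialize a session.update event carrying the accent instructions.

    Assumption: the session accepts an `instructions` string; other session
    fields (voice, model, and so on) are left out on purpose.
    """
    event = {
        "type": "session.update",
        "session": {"instructions": build_accent_instructions(accent)},
    }
    return json.dumps(event, ensure_ascii=False)


if __name__ == "__main__":
    # Print the JSON you would send over the already-open realtime connection.
    print(build_session_update("Castilian"))
```

Setting this once at session level means you don't have to repeat “speak in peninsular Spanish” in every prompt, although whether it removes the remaining slips is something you would need to test.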
Ah, very interesting. I had sort of assumed the voice samples were a muddle of different accents that weren't clearly labeled; it never occurred to me that the prompt might not be specific enough.