How to Fine-Tune Pronunciation with OpenAI's Text-to-Speech API?

I am using OpenAI’s text-to-speech API and would like finer control over how the speech is rendered (e.g., speed, intonation, accent). In particular, I want to correct mispronunciations and specify how certain words or phrases should be read. Does anyone know of additional parameters or methods for this?
Here’s the current request setup:

javascript

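// Wrapper name is illustrative; requestUrl is assumed to be a backend proxy
// that forwards to the speech endpoint and adds the Authorization header.
async function synthesizeSpeech(text, voiceName, speakingRate) {
    const requestJson = {
        model: 'tts-1',
        voice: voiceName,     // A voice name such as 'alloy', not a language code
        input: text,          // Text to be spoken
        speed: speakingRate   // Speaking rate, 0.25 to 4.0 (default 1.0)
    };
    const res = await fetch(requestUrl, {
        method: 'POST',
        headers: {
            'Content-Type': 'application/json'
        },
        body: JSON.stringify(requestJson)
    });
    // Fail early instead of wrapping an error response in an audio Blob
    if (!res.ok) {
        throw new Error(`TTS request failed with status ${res.status}`);
    }
    const data = await res.arrayBuffer();
    // 'audio/mpeg' is the registered MIME type for MP3
    const audioBlob = new Blob([data], { type: 'audio/mpeg' });
    return URL.createObjectURL(audioBlob);
}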
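
One workaround I am considering, in case no such parameter exists, is to respell problem words before sending the text. Below is a minimal sketch of that idea. The `respellings` table, its entries, and the helper names are my own hypothetical placeholders, not part of the OpenAI API, and I have not verified how reliably the model follows respellings.

```javascript
// Hypothetical lookup table: written form -> phonetic respelling.
// These entries are illustrative guesses, not API features.
const respellings = {
    'Nguyen': 'Win',
    'SQL': 'sequel'
};

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Replace each whole-word occurrence of a table key with its respelling.
// Note: \b only recognizes ASCII word boundaries.
function applyRespellings(text, table) {
    let result = text;
    for (const [word, spoken] of Object.entries(table)) {
        const pattern = new RegExp(`\\b${escapeRegExp(word)}\\b`, 'g');
        // A replacer function avoids special handling of '$' in the replacement.
        result = result.replace(pattern, () => spoken);
    }
    return result;
}
```

The result of `applyRespellings(text, respellings)` would then be passed as `input` in the request above.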

Any advice or insights would be greatly appreciated!
