TTS: add emphasis to one word in spoken text

wwessels1 · April 3, 2024, 6:56pm

Is it possible to add emphasis to one specific word in the text that I want to be spoken by the TTS (audio/speech) endpoint?

Let’s say I want to have the following text to be spoken:
“Are you still using this?”
The exact meaning can be very different when the emphasis is on ‘you’ or ‘still’ or ‘this’.

How can I convince TTS to put emphasis on a certain word?

Thanks!!
Wouter

vb · April 3, 2024, 8:10pm

Hi!

It’s somewhat possible but likely not going to work reliably. From the docs:

There is no direct mechanism to control the emotional output of the audio generated. Certain factors may influence the output audio like capitalization or grammar but our internal tests with these have yielded mixed results.

wwessels1 · April 3, 2024, 8:14pm

Okay, thanks mate!
Perhaps it will be added in the future.
Regards,
Wouter

sps · April 3, 2024, 8:19pm

Try these:

“Are you still using this?”

“Are you still using this?”

“Are you still using this?”

wwessels1 · April 3, 2024, 8:42pm

Interesting …
The input of the audio/speech endpoint is a String.
How do I give italics to that endpoint?

This is the sample code that I use to call the endpoint:

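from pathlib import Path

from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Save the audio next to this script.
speech_file_path = Path(__file__).parent / "speech.mp3"

# `input` is the plain string to be spoken.
response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Today is a wonderful day to build something people love!",
)

response.stream_to_file(speech_file_path)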

How do I tell a String that a word is italic?

vb · April 3, 2024, 8:53pm

You can try:

'TODAY'
'Today!!!'
'<em>Today</em>'

etc…
and why not:

'toDAY'
'2-DAY'

Or you can get creative and add another sentence to tell the model that it should put emphasis on the word:

'Jane wanted to tell everybody about today. When she addressed the crowd, she put emphasis on the word ‘Today’ and then she said: “Today is a wonderful day to build something people love!”'

You have to play around with it, especially since

tests … have yielded mixed results.
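
If you want to try several of these spellings without typing each one by hand, something like the following might help. It is only a sketch: `emphasis_variants` and its style names are made up for illustration (they are not part of the OpenAI library), it assumes only the first whole-word match should be changed, and none of the rewrites is guaranteed to change how the word is spoken.

```python
import re


def emphasis_variants(text: str, word: str) -> dict[str, str]:
    """Return rewrites of `text` that mark the first whole-word match of `word`.

    Hypothetical helper: the style names are arbitrary labels chosen here.
    """
    pattern = re.compile(rf"\b{re.escape(word)}\b")
    replacements = {
        "caps": word.upper(),          # TODAY
        "exclaim": f"{word}!!!",       # Today!!!
        "em_tag": f"<em>{word}</em>",  # <em>Today</em>
    }
    # A callable replacement avoids backslash processing; count=1 changes
    # only the first occurrence.
    return {
        name: pattern.sub(lambda _match, r=replacement: r, text, count=1)
        for name, replacement in replacements.items()
    }


if __name__ == "__main__":
    sentence = "Today is a wonderful day to build something people love!"
    for name, variant in emphasis_variants(sentence, "Today").items():
        print(f"{name}: {variant}")
```

Each value can then be passed as `input` to the speech endpoint so the results can be compared by ear.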

_j · April 3, 2024, 9:03pm

Here’s a technique: insert a short pause, and the AI has to do a little “reset” that makes the next word sound emphasized.

So, are … you still using this?
So, are you … still using this?
So, are you still using … this?
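
The three lines above can also be produced programmatically. A small sketch, assuming the pause marker is the “…” character followed by a space and that only the first whole-word match is targeted; `pause_before` is a hypothetical helper name, not an API function:

```python
import re


def pause_before(text: str, word: str, pause: str = "…") -> str:
    """Insert `pause` plus a space in front of the first whole-word match of `word`."""
    pattern = re.compile(rf"\b{re.escape(word)}\b")
    return pattern.sub(lambda match: f"{pause} {match.group(0)}", text, count=1)


if __name__ == "__main__":
    sentence = "So, are you still using this?"
    for target in ("you", "still", "this"):
        print(pause_before(sentence, target))
```

Whether the pause actually comes across as emphasis still has to be judged by listening to the generated audio.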

Combined from three different runs:

wwessels1 · April 3, 2024, 9:04pm

When I add <em> tags to the text, I do notice some changes, but I do not know whether the changes are due to the tags or because the resulting speech is different every time anyway.

And sometimes the voice actually speaks the ‘em’ !!
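
One rough way to separate the effect of the tags from ordinary run-to-run variation is to generate several takes of every variant, including the unmodified sentence, and listen to them side by side. The sketch below only builds the list of file names and input strings; `comparison_plan` and the file naming scheme are assumptions made for illustration.

```python
def comparison_plan(variants: dict[str, str], runs: int = 3) -> list[tuple[str, str]]:
    """Pair each variant with `runs` output file names such as 'plain_run1.mp3'."""
    return [
        (f"{name}_run{run}.mp3", text)
        for name, text in variants.items()
        for run in range(1, runs + 1)
    ]


if __name__ == "__main__":
    variants = {
        "plain": "Are you still using this?",
        "em_tag": "Are you <em>still</em> using this?",
    }
    for file_name, text in comparison_plan(variants, runs=2):
        print(file_name, "->", text)
```

Each pair can be sent to the speech endpoint with the sample code from earlier in the thread and saved under its file name. If the takes of the plain sentence differ from each other as much as they differ from the tagged takes, the tags are probably not doing much.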

wwessels1 · April 3, 2024, 9:05pm

Awesome, this is actually promising!!!
I will have to play around with this!!!

Thanks!!

sps · April 4, 2024, 5:10am

This is just plain markdown that’s being rendered as italic.

“Are you still *using* this?”

“Are you still using *this*?”

“Are you *still* using this?”

You’ll notice that the TTS will emphasize the word between the asterisks (*).

praveenmenon999 · June 29, 2024, 10:11pm

In my few attempts so far, I have noticed that putting certain words in CAPS helps, and that hyphens (“-”) also help slightly with emphasis. (Damn, that looks like a bird!)
