TTS - adding pauses to speech generations through some kind of input syntax


I would like to be able to tell TTS, via some type of input character combination, when to add pauses between words and how long they should be. Does this exist now? If not, can I recommend that OpenAI consider adding this feature?

Hey champ! And welcome to the community!

Have you tried adding these pauses as you would normally, like this… or – this? :laughing:

I have not tried; I just looked at the documentation and did not see anything about how to add pauses. However, I had hoped for rather lengthy pauses, such as for a guided meditation script, where a pause would in some instances be longer than typical. Is there a formula where each “…” or “–” equals a certain amount of time, so that stacking them would accomplish my objective?

The TTS model appears to take some directions into consideration. [pause], for example, will make it pause: not for a specific length, but enough to break up the speech, as will paragraph breaks or the ellipsis mentioned above.

Perhaps the best example of a handful:

The [pause] syntax is good to know. I like that it is intelligent enough not to read it aloud and that it does pause. I experimented a bit to see if I could lengthen the pause, and it seems that all pauses are created equal. It would be nice if we could add something like [pause:10], or even if it could interpret meaning from [very long pause].

I haven’t tried SSML yet, but if it works that would be real fancy.

I find the [pause] syntax often works, but not always.

Pauses between paragraphs are usually added automatically without extra syntax, but again, not always.

This unpredictability is hard to handle, and it’s one of the main reasons I’m not using the TTS API in production for now.

Full SSML support would be wonderful…

@peterhartree Is there one you have found for production?

I found:

to have an effect, and they can be stacked,
but it is still not consistent.
I am really looking to find a solution.

In the meantime, we had another topic where the solution was a code implementation.
It may work for your case as well.
Otherwise, as far as I know, the situation hasn’t changed.
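
For anyone wondering what a code-side workaround could look like, here is a minimal sketch. It is not necessarily what the other topic did. The idea is to split the script on `[pause:N]` markers (the syntax wished for earlier in this thread, not something the API understands), synthesize each text chunk separately, and stitch exact-length silence in between. `split_script`, `silence_pcm`, `render`, and the `synthesize` callable are all hypothetical names, and the sketch assumes your TTS call returns raw 16-bit mono PCM at 24 kHz.

```python
import re

# Matches markers such as [pause:10] or [pause:2.5]; the number is seconds.
PAUSE_RE = re.compile(r"\[pause:(\d+(?:\.\d+)?)\]")


def split_script(script):
    """Split a script into ("text", str) and ("pause", seconds) segments."""
    segments = []
    pos = 0
    for match in PAUSE_RE.finditer(script):
        text = script[pos:match.start()].strip()
        if text:
            segments.append(("text", text))
        segments.append(("pause", float(match.group(1))))
        pos = match.end()
    tail = script[pos:].strip()
    if tail:
        segments.append(("text", tail))
    return segments


def silence_pcm(seconds, sample_rate=24000, sample_width=2):
    """Return raw mono PCM silence (zero bytes) of the given duration."""
    return b"\x00" * (int(seconds * sample_rate) * sample_width)


def render(script, synthesize, sample_rate=24000):
    """Build one PCM stream from a script with pause markers.

    `synthesize` is a hypothetical callable (text -> raw 16-bit mono PCM
    bytes at `sample_rate`), for example a thin wrapper around your TTS
    request. It is an assumption, not part of any library.
    """
    chunks = []
    for kind, value in split_script(script):
        if kind == "text":
            chunks.append(synthesize(value))
        else:
            chunks.append(silence_pcm(value, sample_rate))
    return b"".join(chunks)
```

Something like `render("Breathe in. [pause:10] Breathe out.", my_tts)` would then give two spoken chunks with exactly ten seconds of silence between them. The trade-off is one request per chunk, and the intonation may not carry across the joins.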