Adding pauses to TTS for generating meditations

ben21 · November 15, 2023, 1:28pm

Is there a way to encode timed pauses in the TTS audio file output?

I’d like to generate guided meditations with 2-3 minute pauses.

sps · November 15, 2023, 1:34pm

Hi @ben21

Welcome to the OpenAI dev forum.

As far as I know, pauses that long have to be implemented in code rather than in the TTS input.

Simply generate two separate files and then append a pause after the first one before concatenating the second one.
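
A minimal sketch of that idea, using only Python's standard `wave` module. It assumes the two clips are already saved as 16-bit PCM WAV files with the same sample rate and channel count (for example, by requesting WAV output from the TTS endpoint, if your SDK version supports it). The helper name `join_with_pause` and the file paths are hypothetical.

```python
import wave


def join_with_pause(first_path, second_path, out_path, pause_seconds):
    """Write first clip + silence + second clip to out_path (hypothetical helper)."""
    with wave.open(first_path, "rb") as first, wave.open(second_path, "rb") as second:
        params = first.getparams()
        # Channels, sample width, and frame rate must match for a raw concatenation.
        if params[:3] != second.getparams()[:3]:
            raise ValueError("WAV files have different audio formats")
        first_frames = first.readframes(first.getnframes())
        second_frames = second.readframes(second.getnframes())

    # Assumption: 16-bit PCM, where silence is a run of zero-valued bytes.
    silent_frame_count = int(pause_seconds * params.framerate)
    silence = b"\x00" * (silent_frame_count * params.sampwidth * params.nchannels)

    with wave.open(out_path, "wb") as out:
        out.setparams(params)
        # The frame count in the header is corrected when the file is closed.
        out.writeframes(first_frames + silence + second_frames)
```

Calling `join_with_pause("part1.wav", "part2.wav", "meditation.wav", 150)` would put a 2.5-minute gap between the two parts.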

ben21 · November 15, 2023, 1:48pm

Thanks for the welcome and the advice, sps! I have started with this workaround.

alessandroamenta1 · January 22, 2024, 6:30pm

Hey Ben! Wondering if you found a workaround that works well enough. I have a similar issue with something I'm trying to build and haven't found a decent solution yet.

ben21 · January 24, 2024, 6:33am

Hi there, I managed a basic hack that breaks the script up into segments and places pauses in between.

Here’s my code:

from openai import OpenAI
from pydub import AudioSegment
import os

# Initialize final_audio as a silent segment of zero duration
final_audio = AudioSegment.silent(duration=0)

# OpenAI API Setup
client = OpenAI(api_key="YOUR_KEY")

# Guided Meditation Script Segments
segments = [
    "MEDITATION_TEXT_1",
    "MEDITATION_TEXT_2",
]

# Define a one-minute pause
one_minute_silence = AudioSegment.silent(duration=60000)  # 60,000 milliseconds

# Generate and combine segments with pauses
for i, segment in enumerate(segments):
    # Generate the audio using OpenAI's text-to-speech
    response = client.audio.speech.create(
        model="tts-1",
        voice="onyx", 
        input=segment,
    )

    # Save and load each segment
    temp_audio_file = f"segment_{i}.mp3"
    with open(temp_audio_file, "wb") as f:
        f.write(response.content) 

    segment_audio = AudioSegment.from_file(temp_audio_file)
    final_audio += segment_audio

    # Add pause after each segment except the last one
    if i < len(segments) - 1:
        final_audio += one_minute_silence

    # Clean up the temporary file
    os.remove(temp_audio_file)

# Export the final audio file
final_audio.export("audio.mp3", format="mp3")

print("Audio with pauses created successfully!")
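
The script above uses the same one-minute pause everywhere. If the pauses need to vary (say 2 minutes after one passage and 3 after another), one option is to mark them in the script text and split on the markers before calling TTS. This is only a sketch: the `[pause N]` marker syntax (N in seconds) and the `split_script` helper are made up for illustration and are not something the TTS API understands.

```python
import re

# Hypothetical marker syntax, e.g. "[pause 120]" for a 120-second pause.
# It is stripped out here and never sent to the TTS API.
PAUSE_MARKER = re.compile(r"\[pause\s+(\d+(?:\.\d+)?)\]")


def split_script(script):
    """Return (text, pause_ms) pairs; pause_ms is the silence that follows the text."""
    # With one capture group, re.split yields: text, seconds, text, seconds, ..., text
    parts = PAUSE_MARKER.split(script)
    pairs = []
    for index in range(0, len(parts), 2):
        text = parts[index].strip()
        has_pause = index + 1 < len(parts)
        pause_ms = round(float(parts[index + 1]) * 1000) if has_pause else 0
        # Skip empty trailing text that carries no pause.
        if text or pause_ms:
            pairs.append((text, pause_ms))
    return pairs
```

In the loop above, each `text` would go to `client.audio.speech.create(...)` (skipping empty strings), followed by `AudioSegment.silent(duration=pause_ms)` whenever `pause_ms` is greater than zero.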
