Adding pauses to TTS for generating meditations

Is there a way to encode timed pauses in the TTS audio file output?

I’d like to generate guided meditations with 2-3 minute pauses.

As far as I know, pauses that long have to be added in code.

Simply generate two separate audio files, then append a silent segment to the first one before concatenating the second.
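As a rough sketch of that idea using only the Python standard library: the helper below joins two clips with a stretch of silence in between. It assumes both clips are uncompressed 16-bit PCM WAV files in the same format (MP3 output would need a library such as pydub instead); the function name `concat_with_pause` and the file paths are hypothetical.

```python
import wave

def concat_with_pause(first_path, second_path, out_path, pause_seconds):
    """Write first clip + silence + second clip to out_path (PCM WAV only)."""
    with wave.open(first_path, "rb") as first, wave.open(second_path, "rb") as second:
        # Both clips must share channel count, sample width, and sample rate
        if first.getparams()[:3] != second.getparams()[:3]:
            raise ValueError("clips must share the same audio format")
        nchannels, sampwidth, framerate = first.getparams()[:3]
        first_frames = first.readframes(first.getnframes())
        second_frames = second.readframes(second.getnframes())

    # Assumption: 16-bit signed PCM, where zero bytes are silence
    silent_frames = int(framerate * pause_seconds)
    silence = b"\x00" * (silent_frames * nchannels * sampwidth)

    with wave.open(out_path, "wb") as out:
        out.setnchannels(nchannels)
        out.setsampwidth(sampwidth)
        out.setframerate(framerate)
        out.writeframes(first_frames + silence + second_frames)
```

For a three-minute gap you would call something like `concat_with_pause("part1.wav", "part2.wav", "meditation.wav", 180)`.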


Thanks for the welcome and the advice, sps! I have started with this workaround :slight_smile:

Hey Ben! Wondering if you found a workaround that works well enough. I have a similar issue with something I'm trying to build and haven't found a decent solution yet. :slight_smile:

Hi there, I managed a basic hack that breaks the script up into segments and places pauses in between.

Here’s my code:

from openai import OpenAI
from pydub import AudioSegment
import os

# Initialize final_audio as a silent segment of zero duration
final_audio = AudioSegment.silent(duration=0)

# OpenAI API Setup
client = OpenAI(api_key="YOUR_KEY")

# Guided Meditation Script Segments
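segments = [
    # Placeholder text: substitute the lines of your own meditation script
    "Welcome. Find a comfortable position and close your eyes.",
    "Now gently bring your attention back to your breath.",
]
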
# Define a one-minute pause
one_minute_silence = AudioSegment.silent(duration=60000)  # 60,000 milliseconds

# Generate and combine segments with pauses
for i, segment in enumerate(segments):
    # Generate the audio using OpenAI's text-to-speech
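    # The model and voice below are assumed values; pick whichever you prefer
    response = client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=segment,
    )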

    # Save and load each segment
    temp_audio_file = f"segment_{i}.mp3"
    with open(temp_audio_file, "wb") as f:
        f.write(response.content)  # raw MP3 bytes returned by the API

    segment_audio = AudioSegment.from_file(temp_audio_file)
    final_audio += segment_audio

    # Add pause after each segment except the last one
    if i < len(segments) - 1:
        final_audio += one_minute_silence

    # Clean up the temporary file
    os.remove(temp_audio_file)

# Export the final audio file
final_audio.export("audio.mp3", format="mp3")

print("Audio with pauses created successfully!")
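If you want pauses of different lengths (say 2 minutes in one place and 3 in another), one possible approach is to mark them in the script itself and split on the markers. This is only a sketch: the `[pause N]` marker syntax and the `split_script` helper are made up for illustration and are not something the TTS API understands.

```python
import re

# Hypothetical marker syntax: "[pause 120]" means 120 seconds of silence
PAUSE_MARKER = re.compile(r"\[pause\s+(\d+)\]")

def split_script(script):
    """Return (text, pause_ms) pairs; each pause follows its text."""
    # With one capture group, re.split alternates: text, seconds, text, ...
    parts = PAUSE_MARKER.split(script)
    texts = parts[0::2]
    # The last piece of text has no marker after it, so it gets a 0 ms pause
    pauses = [int(seconds) * 1000 for seconds in parts[1::2]] + [0]
    # Drop entries that have neither text nor a pause
    return [(t.strip(), p) for t, p in zip(texts, pauses) if t.strip() or p]

pairs = split_script(
    "Breathe in. [pause 120] Breathe out. [pause 180] Open your eyes."
)
print(pairs)
```

In the loop above you could then iterate over these pairs, send each non-empty text to the TTS call, and append `AudioSegment.silent(duration=pause_ms)` whenever `pause_ms` is greater than zero.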