Gpt-4o-mini-tts API returns empty response

There must be an outage. We are sending far fewer than 4096 characters.

Has been working for months until now.


Do you have particular parameters to report? Has the issue passed in the six hours since, given that there have been no other replies?

I got speech audio back, mp3 output:

Nothing out of the ordinary. No streaming. Here is a test:

{"model": "gpt-4o-mini-tts",
"voice": "alloy",
"input": "This is a test. This is a test. This is a test.",
"instructions": "Speak in a clear, deliberate, and professional tone."}

No error code - still just an empty response, which is unusual…

https://www.openai.fm/ seems to be working fine.

Let’s see if any other complaints come in. Unless nobody uses TTS :face_with_raised_eyebrow:

I made a half-dozen calls emulating what you show, all success.

The only thing you didn’t specify is a requested file format. This Python script requests the format of your specified output file name’s extension.
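The extension-to-format mapping in the script below works like this (a minimal sketch; note that `Path.suffix` includes the leading dot, which has to be stripped before it can be passed as `response_format`):

```python
from pathlib import Path

# Derive the API's response_format from the output file name's extension
save_file = "speech.flac"
file_extension = Path(save_file).suffix.lstrip(".").lower()
print(file_extension)  # flac
```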

'''OpenAI audio-speech TTS, generate speech from instructions and text'''
import json
import os
import httpx
from pathlib import Path


def save_audio_stream(input_text: str,
                      model: str = "gpt-4o-mini-tts",
                      voice: str = "alloy",
                      save_file: str = "output.mp3",
                      instructions: str | None = None,
                      **kwargs) -> Path | None:
    """Save streamed TTS audio data to a file using direct HTTP requests with httpx."""
    api_key = os.getenv("OPENAI_API_KEY")
    if not api_key:
        raise EnvironmentError("OpenAI API key not found. "
                               "Please set the OPENAI_API_KEY environment variable.")

    path = Path(save_file)
    valid_formats = ['mp3', 'opus', 'aac', 'flac', 'wav', 'pcm']
    file_extension = path.suffix.lstrip('.').lower()
    if file_extension not in valid_formats:
        raise ValueError(f"Unsupported format: {file_extension}.\nUse: {valid_formats}.")

    payload = {
        "input": input_text,
        "model": model,       # 'tts-1', 'tts-1-hd', or 'gpt-4o-mini-tts'
        "voice": voice,       # STANDARD (alloy, echo, fable, onyx, nova, shimmer)
                              # or NEW (ash, ballad, coral, sage, verse)
        "response_format": file_extension,  # desired audio format
        #"speed": 1.0,  # optional: a value from 0.25 to 4.0
        **kwargs,
    }
    if instructions is not None:
        # Voice style instructions (gpt-4o-mini-tts ONLY); omitted when None
        payload["instructions"] = instructions

    # First try/except: perform the request and capture HTTP-level errors.
    try:
        with httpx.stream("POST",
                          "https://api.openai.com/v1/audio/speech",
                          headers={"Authorization": f"Bearer {api_key}"},
                          json=payload) as response:

            # Separate try/except so we don't continue if status is not 2xx.
            try:
                response.raise_for_status()
            except httpx.HTTPStatusError as e:
                # Attempt to read the entire body **once** – required for streaming responses.
                body_bytes = response.read()  # consumes the stream so call only here.
                err_detail = ""
                if body_bytes:
                    try:
                        err_json = json.loads(body_bytes)
                        if isinstance(err_json, dict):
                            err_detail = err_json.get("error", {}).get("message", str(err_json))
                    except Exception:
                        err_detail = body_bytes.decode(errors="replace").strip()
                print(f"HTTP error while fetching the audio stream: {e}\nDetails: {err_detail if err_detail else 'No additional details.'}")
                return  # Exit early – nothing further to do.

            # Second try/except: write the (now guaranteed) good stream to disk.
            try:
                with open(path, 'wb') as f:
                    for chunk in response.iter_bytes():
                        f.write(chunk)
            except (OSError, IOError) as file_err:
                print(f"Could not write audio to {path}: {file_err}")
                return

        return path

    except httpx.HTTPError as net_err:
        # Network / connection issues (DNS, timeouts, etc.)
        print(f"Network error while attempting to contact OpenAI: {net_err}")


# Usage of TTS function
input_text="""
This is a test. This is a test. This is a test.
"""
instructions="""
Speak in a clear, deliberate, and professional tone.
"""
tts_return = save_audio_stream(
    input_text=input_text.strip(),
    instructions=instructions.strip(),
    save_file="testing.mp3",
    voice="elvis",
)
print(f"If it didn't crash, you have an audio file:\nDownloaded: {tts_return}")

(updated the code so it will catch an error and give good info from the API about anything gone wrong. To make it actually work, change the voice at the end from being “elvis” to a real TTS voice)

If you get empty mp3 files instead of 60kB audio insisting “this is a test”, yet no error, I would try a new project/API key: see if it isn’t a provisioning or rights problem that is not raising an exception.

The only thing you didn’t specify is a requested file format.

The response format is optional - the default is mp3. I never had to specify that before. Keep in mind: this has been working for months - since gpt-4o-mini-tts was first released.

No issues with Project/API keys. I use many different APIs with no problem.

See recent usage below - today’s input tokens was 5791:

In usage, pick a range of just today, pick spend categories, and find the model way down in the list of API models…

I’m paying “successfully” for what’s sent and received in credits.

I can only suggest changes to the way you make the call: skip the SDK (with its constant blocking and breaking), and make sure you capture any body returned with an error even when it isn’t JSON. Doing nothing different won’t fix this for you, and you’d want to give OpenAI substantial feedback on what’s gone wrong, that it’s specific to your API organization, and that there’s no other cause.

@_j Well, captured the body and it is NOT an empty response. It’s 154k of a combination of Chinese and other characters. Example:

쓄娀㥜Ø㋞㣜挂赋댷ضⅢ즫ē靀꺋⁠悠躵斏ⶳ⦺璪⸀や䀰ᎀ䄑庖떕⯊襲獊첖阾ㅺዋힺ覣쬖㫗팒霍틑᱊뚶

In fact, I used Google Translate to translate a small section: 漠撡萦䦔韬莐㟚䆷孪 translates to “The desert is full of”

What is going on here?


What you have there, with the unrendered F3 FF at its start as a clue:

What if I open an MP3 that was received in a hex editor?
'expected_prefix_hex': 'FF F3 C4 C4 00 5B 8C 39 E8 00 EE 1E DC 38 29 12',

What if I then decode your “chinese” into UTF-16 little-endian?

'FF F3 C4 C4 00 5A 5C 39 D8 00 DE 32 DC 38 02 63…

You’ve got an MP3 file there.
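You can verify the round-trip yourself: re-encoding the visible characters of the garbled sample as UTF-16 little-endian reproduces the MP3 bytes. A quick sketch using the thread’s own sample (assuming the pasted text survived copy/paste intact):

```python
# Visible start of the "Chinese" garble (after the unrendered U+F3FF char)
garbled = "쓄娀㥜Ø"

# Re-encode the mojibake as UTF-16 LE to recover the raw MP3 bytes
raw = garbled.encode("utf-16-le")
print(raw.hex(" "))

# Two characters we can pin down exactly:
assert "Ø".encode("utf-16-le") == b"\xd8\x00"   # U+00D8 -> D8 00
assert "挂".encode("utf-16-le") == b"\x02\x63"  # U+6302 -> 02 63
```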

Now just commit it to binary to disk. The API returns binary files, unless you are using the stream parameter and want to decode delta chunks of base64 instead.
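In other words: treat the body as bytes end to end, and never run it through a text decode. A minimal sketch (the body bytes here are hypothetical, standing in for what the API returned):

```python
# Hypothetical response body: an MPEG frame starts with an FF Fx sync header
body = bytes.fromhex("fff3c4c4005a5c39d800de32dc380263")

# Write in binary mode ('wb'); opening in text mode or decoding first
# is exactly what turns the file into "Chinese" mojibake
with open("output.mp3", "wb") as f:
    f.write(body)

with open("output.mp3", "rb") as f:
    saved = f.read()
assert saved[:2] == b"\xff\xf3"  # MP3 sync bytes intact
```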

You kidding…

So, “This is a test. This is a test. This is a test.” produces 154k of Chinese and other characters, and a small section just so happens to translate into “The desert is full of”??

That’s a sign of a bad language translator trying to make sense of a binary file being converted into random Unicode code points instead of being saved properly as a file.

I could probably decode a JPEG or any other binary data file into a stream of happenstance glyphs and have some dumb AI make nonsense translations out of it, also. Chinese characters have meaning, unlike syllabic, “spelled” Latin languages, so it is easy to ascribe semantics where there are none.


My code shown writes bytes “downloaded” to a file:

for chunk in response.iter_bytes():
    f.write(chunk)

Whatever you’re up to, it just ain’t right!

Whatever you’re up to, it just ain’t right!

I’m contacting support@openai.com to straighten this out.

Better, just run my no-SDK code, see it saves files as expected, have an “aha” moment, and proceed in that style.

If you updated the openai SDK library, and that broke an app that was working, you could revert the version. OpenAI is announcing a new realtime model now, and they might have been messing with the libraries and methods.

Like I said, we don’t use SDKs. But if OpenAI is updating the model, then I’m all for that.

Thanks!
