TTS: WAV file is corrupted

Hello,
I recently started using the TTS endpoint. When I request a WAV file, the WAV data size in the header (and possibly other attributes) appears to be corrupted.

My request is:

curl https://api.openai.com/v1/audio/speech \
  -H "Authorization: Bearer xxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "The quick brown fox jumped over the lazy dog.",
    "voice": "alloy",
    "response_format": "wav"
  }' \
  --output speech.wav

I can play the file, but I cannot load it in certain programs (such as Unreal Engine 5.3). When I run ffprobe on it, I get:

[wav @ 0x55e3da9d7a80] Ignoring maximum wav data size, file may be invalid
[wav @ 0x55e3da9d7a80] Packet corrupt (stream = 0, dts = NOPTS).
[wav @ 0x55e3da9d7a80] Estimating duration from bitrate, this may be inaccurate
Input #0, wav, from 'speech.wav':
  Duration: 00:00:02.76, bitrate: 384 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 24000 Hz, 1 channels, s16, 384 kb/s

and with soxi:

Input File     : 'speech.wav'
Channels       : 1
Sample Rate    : 24000
Precision      : 16-bit
Duration       : 24:51:18.49 = 2147483647 samples ~ 6.71089e+06 CDDA sectors
File Size      : 133k
Bit Rate       : 11.9
Sample Encoding: 16-bit Signed Integer PCM

(Notice the duration)
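
As a rough sanity check on that number: it is what you would get if the header's data size field held 0xFFFFFFFF, a placeholder sometimes written when the final length is not known at the time the header is sent. That is only my assumption; I have not confirmed that the endpoint does this. The sample rate and sample width below are taken from the soxi output above.

```python
# Values taken from the soxi output above.
SAMPLE_RATE = 24000       # Hz
BYTES_PER_SAMPLE = 2      # 16-bit mono PCM

# Assumption (not confirmed): the data size field holds this placeholder.
PLACEHOLDER_SIZE = 0xFFFFFFFF


def format_duration(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS.ss."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{int(hours):02d}:{int(minutes):02d}:{secs:05.2f}"


# Number of samples a reader would infer from the placeholder size.
claimed_samples = PLACEHOLDER_SIZE // BYTES_PER_SAMPLE
claimed_duration = format_duration(claimed_samples / SAMPLE_RATE)

print(claimed_samples)    # 2147483647
print(claimed_duration)   # 24:51:18.49
```

Both values match what soxi reports, so the audio data itself may be fine and only the size fields wrong.
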
Did I do something wrong or might this be a bug? Does anyone know a way to fix this?

Thank you very much, this tip helped me a lot!

This seems to be a bug. When I try to read an OpenAI WAV file with ffmpeg, it prints the warning: Ignoring maximum wav data size, file may be invalid
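
A possible workaround, as a minimal sketch: recompute the two size fields in the header from the actual file length. This assumes the audio samples are intact, that only the RIFF and data chunk sizes are wrong, and that the data chunk is the last chunk in the file. I have not verified any of that against the API's output. The function names `fix_wav_sizes` and `fix_wav_file` are made up for this example.

```python
import struct


def fix_wav_sizes(raw: bytes) -> bytes:
    """Return a copy of a WAV byte string with RIFF and data sizes recomputed."""
    if raw[0:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    buf = bytearray(raw)
    # The RIFF chunk size covers everything after the first 8 bytes.
    struct.pack_into("<I", buf, 4, len(buf) - 8)

    # Walk the chunks that follow the 12-byte RIFF/WAVE preamble.
    pos = 12
    while pos + 8 <= len(buf):
        chunk_id = bytes(buf[pos:pos + 4])
        (chunk_size,) = struct.unpack_from("<I", buf, pos + 4)
        if chunk_id == b"data":
            # Assumption: the data chunk runs to the end of the file.
            struct.pack_into("<I", buf, pos + 4, len(buf) - (pos + 8))
            return bytes(buf)
        # Chunks are padded to an even number of bytes.
        pos += 8 + chunk_size + (chunk_size & 1)

    raise ValueError("no data chunk found")


def fix_wav_file(src: str, dst: str) -> None:
    """Read src, repair its size fields, and write the result to dst."""
    with open(src, "rb") as f:
        fixed = fix_wav_sizes(f.read())
    with open(dst, "wb") as f:
        f.write(fixed)
```

Calling `fix_wav_file("speech.wav", "speech_fixed.wav")` and then running soxi or ffprobe on the new file should show whether the warnings go away. Re-muxing with `ffmpeg -i speech.wav -c copy speech_fixed.wav` may also rewrite the header, though I have not tested it with Unreal Engine.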
