AAC files have wrong duration

adigladi · April 16, 2024, 10:37pm

I have noticed that the AAC files generated by audio/speech report the wrong duration. This is apparent when opening the files in the Firefox player or via the previewer in OS X.

This does NOT seem to happen with the default mp3 format, but it happens consistently with aac (I’ve seen it happen with other formats as well, although I haven’t tested those as much).

Repro:

curl 'https://api.openai.com/v1/audio/speech' \
  -H 'authority: api.openai.com' \
  -H 'accept: */*' \
  -H 'authorization: Bearer <token>' \
  -H 'content-type: application/json' \
  --data-raw '{"model":"tts-1-hd","input":"This is a shorter sentence, but the problem seems to be more prominent with longer texts that are over a minute long at least.","voice":"nova","response_format":"aac"}' \
  --output audio.aac

I can’t attach a link or a zip to this topic with examples, but I have an example with aac/mp3 if needed.

_j · April 16, 2024, 11:14pm

AAC on its own is a codec, not a file format. It needs a container if you are doing anything other than simply rendering it.

OpenAI doesn’t seem to account for this and instead returns a raw ADTS stream as the file.

They use libfaac 1.30.

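The AAC cannot be decoded by neroaacdec, which instead gives a “moov box not found” error, meaning it was looking for an MP4 container.

An ADTS stream has no header field that stores the total duration, so a player has to estimate the play time from file size and bitrate, which is unreliable for a VBR file.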

You could mux it into an MP4 (M4A) container, or just request a different format.
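A minimal sketch of the muxing route in Python, assuming ffmpeg is installed and on your PATH. The helper names `build_remux_command` and `remux_adts_to_m4a` are made up for illustration. It copies the AAC frames into an MP4 container without re-encoding. The `aac_adtstoasc` bitstream filter is passed explicitly here, although recent ffmpeg builds insert it automatically for this case.

```python
import shutil
import subprocess


def build_remux_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that rewraps an ADTS .aac file into MP4/M4A."""
    return [
        "ffmpeg",
        "-y",                       # overwrite dst if it already exists
        "-i", src,                  # input: raw ADTS stream
        "-c:a", "copy",             # copy the audio frames, no re-encode
        "-bsf:a", "aac_adtstoasc",  # convert ADTS headers to MP4-style config
        dst,
    ]


def remux_adts_to_m4a(src: str, dst: str) -> None:
    """Run the remux; raises if ffmpeg is missing or exits with an error."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_remux_command(src, dst), check=True)
```

Calling `remux_adts_to_m4a("audio.aac", "audio.m4a")` should produce a file whose container carries the duration, so players no longer have to guess it.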

my audio:

ffprobe -v error -show_entries stream=codec_name,bit_rate,duration,r_frame_rate,avg_frame_rate -of default=noprint_wrappers=1 audio.aac
codec_name=aac
r_frame_rate=0/0
avg_frame_rate=0/0
duration=19.225456
bit_rate=51917

(the audio actually plays for 21 s, not the 19.2 s reported)
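If you want the real play time without remuxing, one option is to walk the ADTS frame headers and count samples instead of trusting the bitrate-based estimate. A rough Python sketch follows. It assumes plain AAC-LC with 1024 samples per raw data block (HE-AAC/SBR output would need different handling), and the function name `adts_duration_seconds` is only illustrative.

```python
# Sample rates indexed by the 4-bit sampling_frequency_index in the ADTS header.
ADTS_SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                     24000, 22050, 16000, 12000, 11025, 8000, 7350]
SAMPLES_PER_RAW_BLOCK = 1024  # assumption: AAC-LC


def adts_duration_seconds(data: bytes) -> float:
    """Estimate play time of an ADTS stream by counting its frames."""
    pos = 0
    total_samples = 0
    sample_rate = None
    while pos + 7 <= len(data):
        b = data[pos:pos + 7]
        # 12-bit syncword 0xFFF followed by layer bits that must be 00.
        if b[0] != 0xFF or (b[1] & 0xF6) != 0xF0:
            pos += 1  # not a frame header; resync one byte further on
            continue
        sfi = (b[2] >> 2) & 0x0F
        frame_length = ((b[3] & 0x03) << 11) | (b[4] << 3) | (b[5] >> 5)
        if sfi >= len(ADTS_SAMPLE_RATES) or frame_length < 7:
            pos += 1  # implausible header; treat as a false sync
            continue
        if pos + frame_length > len(data):
            break  # truncated final frame; ignore it
        sample_rate = ADTS_SAMPLE_RATES[sfi]
        raw_blocks = (b[6] & 0x03) + 1  # field stores (count - 1)
        total_samples += raw_blocks * SAMPLES_PER_RAW_BLOCK
        pos += frame_length  # header plus payload
    if sample_rate is None:
        raise ValueError("no ADTS frames found")
    return total_samples / sample_rate
```

Something like `adts_duration_seconds(open("audio.aac", "rb").read())` should then return the length implied by the frames themselves. It ignores encoder priming delay, so expect it to be off by a few tens of milliseconds.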

adigladi · April 17, 2024, 9:04am

Thank you for the thorough info. I’m trying to avoid client-side muxing, but I might have to go that route then. It would be great if the returned aac files could be in an m4a/mp4 container from the start, though.
