What is the difference between TTS HD and TTS?

According to the documentation, tts-1 is optimized for speed, while tts-1-hd is optimized for quality. However, in about 30 Japanese text-to-speech tests that I ran, tts-1-hd often read parts of the Japanese text with a strange pronunciation that was neither Japanese nor English.

This suggests that tts-1 and tts-1-hd may have been trained on different datasets, although I cannot confirm that.

I have not confirmed whether this applies to languages other than Japanese, but which model to prefer may depend on the language of the text being synthesized.
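If you want to check this for your own language, here is a rough sketch of how the same text could be rendered with both models for a side-by-side listen. The helper name `synthesize_with_both`, the output folder, and the `alloy` voice are my own choices for illustration. It assumes a client object shaped like the one from the official `openai` Python package (v1.x), where `client.audio.speech.create(...)` returns a response whose `.content` holds the audio bytes (MP3 by default).

```python
from pathlib import Path

# The two models being compared in this thread.
MODELS = ("tts-1", "tts-1-hd")


def synthesize_with_both(client, text, out_dir, voice="alloy"):
    """Render the same text with each model and return {model: output path}.

    `client` is assumed to expose `audio.speech.create(model=, voice=, input=)`
    as the OpenAI v1.x Python client does. This helper is hypothetical and
    not part of any library.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = {}
    for model in MODELS:
        response = client.audio.speech.create(model=model, voice=voice, input=text)
        path = out_dir / f"{model}.mp3"
        # Assumption: `.content` is the raw audio as bytes.
        path.write_bytes(response.content)
        paths[model] = path
    return paths
```

To run it for real, install the `openai` package, set `OPENAI_API_KEY`, create the client with `from openai import OpenAI; client = OpenAI()`, and call `synthesize_with_both(client, "こんにちは、今日はいい天気ですね。", "tts_compare")`. Then listen to the two files and compare the pronunciation yourself.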

The pricing is listed per character, so billing is probably based on character count rather than token count.
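If billing really is per character, a cost estimate is just the character count times the rate. Below is a minimal sketch under that assumption. The function name `estimate_tts_cost` is made up, the rate is a parameter you would fill in from the current pricing page, and I am assuming that "character" means one Unicode code point (which is what Python's `len` counts). That last point matters for Japanese, where one character can span several bytes or tokens.

```python
def estimate_tts_cost(text: str, usd_per_million_chars: float) -> float:
    """Estimate the cost in USD of synthesizing `text`.

    Assumes per-character billing where each Unicode code point counts as
    one character. The rate is supplied by the caller and is not looked up.
    """
    return len(text) * usd_per_million_chars / 1_000_000


if __name__ == "__main__":
    sample = "こんにちは、今日はいい天気ですね。"
    placeholder_rate = 15.0  # placeholder value, not a confirmed price
    print(len(sample), "characters")
    print(estimate_tts_cost(sample, placeholder_rate), "USD (estimate)")
```

Under this assumption a Japanese sentence and an English sentence of the same character count would cost the same, even though they would tokenize very differently.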
