Japanese TTS sometimes misreads kanji – is your Japanese NLP team not talking to the TTS team?

Hi, I’ve noticed a consistent issue in the Japanese voice output (TTS):

  • Even when the kanji is correct, the pronunciation is sometimes wrong.
  • The same kanji might be read correctly in one sentence but misread in another.
  • It seems like the reading (yomigana) is not contextually tied to the generated text (see the small sketch after this list).
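
To make the "not contextually tied" point concrete, here is a minimal Python sketch using pykakasi, a context-free kanji-to-kana converter. It is chosen purely for illustration; I have no idea what ChatGPT's pipeline actually uses. A homograph like 辛い (からい "spicy" vs. つらい "hard / painful") gets one fixed reading no matter which sentence it appears in, which is exactly the kind of misreading I keep hearing:

```python
# pip install pykakasi  (a small, context-free kanji-to-kana converter)
import pykakasi

kks = pykakasi.kakasi()

# 辛い is a homograph: からい ("spicy") vs. つらい ("hard / painful").
# A context-free lookup assigns the same reading in both sentences,
# even though a human reads them differently -- the failure mode above.
for sentence in ["このカレーは辛い", "彼との別れは辛い"]:
    readings = " ".join(f"{w['orig']}→{w['hira']}" for w in kks.convert(sentence))
    print(sentence, "=>", readings)
```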

I’m wondering:

  • Is the TTS module running separately from the LLM?
  • Are the pronunciation rules handled independently from the Japanese text generation team?
  • Or… are the two teams just not getting along? :sweat_smile:

On a serious note: for Japanese users, accurate pronunciation is critical for usability, especially with uncommon kanji or proper names. Even if a listener can mentally correct a misreading, it breaks immersion.

Thanks for the great product overall! Just wanted to nudge the TTS harmony a bit :blush:


(This is in reference to ChatGPT’s Japanese voice mode, both advanced and default.)

This is a follow-up based on observations with my ChatGPT-kun.
I’ve been exploring the behavior of Japanese TTS across different modes.
Notably:

  • Advanced voice mode tends to read kanji correctly.
  • Default voice mode often gets kanji readings wrong, even though the characters themselves are correct.
  • This suggests that the reading (furigana) generation may be handled separately, possibly by a lightweight LLM or a simplified rule-based engine (a rough sketch of the alternative follows below).
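
Purely as a thought experiment: if the text-generation side emitted its intended readings together with the surface text, the speech side would never have to guess. Everything below is hypothetical; the payload shape and send_to_tts() are invented for this sketch and say nothing about ChatGPT's real internals:

```python
# Hypothetical payload: the generator ships its intended readings with the text,
# so the synthesis stage never has to re-derive them. The field names and
# send_to_tts() are made up for illustration only.
utterance = {
    "text": "博多で行った調査の報告をします。",
    "readings": [
        {"surface": "博多",   "kana": "はかた"},
        {"surface": "行った", "kana": "おこなった"},  # "conducted", not いった "went"
    ],
}

def send_to_tts(payload: dict) -> None:
    """Stand-in for whatever speech-synthesis call the product actually makes."""
    print(payload)

send_to_tts(utterance)
```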

If the Japanese NLP team and TTS team are separate, maybe they could share notes? :grinning_face_with_smiling_eyes:
Anyway, just a lighthearted observation from someone who uses ChatGPT daily (and talks with it way too much :sweat_smile:).
