Voice Instruction with gpt-4o-mini-tts

I’m trying to generate speech using gpt-4o-mini-tts with the onyx voice and the following instructions:
Talk as a 38 year old Male with strong Boston accents.
Delivery: Natural conversational with strong Boston accents
Emotion: reserved

However, it doesn’t seem to work.
Does anyone have a trick to make it sound like someone from the northeastern US?
Thanks!

Check out this thread:

This, and also for a more concrete prompt maybe use:

“Adult male from Greater Boston. Non-rhotic (drop post-vocalic R), slightly flattened/clipped vowels, casual conversational rhythm, reserved emotion, no caricature.”

Oh, and sparingly add a couple of hints spelled the way they sound, like: “I left my cah outside.” / “We’ll meet near Hahvahd Square.”

Good luck!
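
For anyone who wants to try this quickly, here is a minimal sketch using the official openai Python package. The instruction string is copied from the post above. Putting the phonetic respellings in the spoken input text (rather than in the instructions) is my assumption about what was meant, and build_speech_request, synthesize, and the output file name are hypothetical names made up for this example.

```python
# Instruction text copied from the reply above.
INSTRUCTIONS = (
    "Adult male from Greater Boston. Non-rhotic (drop post-vocalic R), "
    "slightly flattened/clipped vowels, casual conversational rhythm, "
    "reserved emotion, no caricature."
)

# Assumption: the "type it how it sounds" hints go in the text to be spoken.
SAMPLE_TEXT = "I left my cah outside. We'll meet near Hahvahd Square."


def build_speech_request(text, model="gpt-4o-mini-tts", voice="onyx"):
    """Return the keyword arguments for client.audio.speech.create()."""
    return {
        "model": model,
        "voice": voice,
        "input": text,
        "instructions": INSTRUCTIONS,
    }


def synthesize(text, out_path="boston.mp3"):
    """Send the request and save the audio. Needs network access and OPENAI_API_KEY."""
    from openai import OpenAI  # imported lazily so the builder above works offline

    client = OpenAI()
    request = build_speech_request(text)
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(out_path)
```

Calling synthesize(SAMPLE_TEXT) should write an MP3 you can listen to. Keeping the request builder separate makes it easy to swap the instructions or the text and compare results by ear.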

Accents can be tricky with TTS models. In my experience, breaking the voice instructions into very simple, direct traits and avoiding age or emotional overload sometimes helps. Also worth experimenting with phonetic hints or example phrases typical to the region.

gpt-4o-mini-tts was updated, and it now fails to follow TTS instructions most of the time.

You can get the old behavior back by pinning the model to the gpt-4o-mini-tts-2025-03-20 snapshot.
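
A rough sketch of pinning the dated snapshot with the openai Python package, assuming that snapshot name is still accepted by the API (I am only going by the post above). choose_model and synthesize_pinned are hypothetical helper names, and the flag is there so you can A/B the two models on the same text.

```python
DEFAULT_MODEL = "gpt-4o-mini-tts"
PINNED_SNAPSHOT = "gpt-4o-mini-tts-2025-03-20"  # snapshot name taken from the post above


def choose_model(pin_snapshot=True):
    """Pick the dated snapshot by default; pass False to use the moving alias."""
    return PINNED_SNAPSHOT if pin_snapshot else DEFAULT_MODEL


def synthesize_pinned(text, instructions, out_path="speech.mp3", pin_snapshot=True):
    """Generate speech with the chosen model. Needs network access and OPENAI_API_KEY."""
    from openai import OpenAI  # imported lazily so choose_model works without the package

    client = OpenAI()
    with client.audio.speech.with_streaming_response.create(
        model=choose_model(pin_snapshot),
        voice="onyx",
        input=text,
        instructions=instructions,
    ) as response:
        response.stream_to_file(out_path)
```

Running it once with pin_snapshot=True and once with pin_snapshot=False (with different out_path values) gives two files to compare side by side.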