Since about a month ago, the text-to-speech API is no longer following the instructions parameter. No matter what instructions I enter, the speech is always the same. It happens both when accessing the API via the Node library ("openai": "^6.16.0") and via the openai.fm demo, which posts an HTTP request directly.
For example, the following two instruction strings generate the same speech:
instructions: 'Voice: Deep, hushed, and enigmatic, with a slow, deliberate cadence that draws the listener in. Phrasing: Sentences are short and rhythmic, building tension with pauses and carefully placed suspense. Punctuation: Dramatic pauses, ellipses, and abrupt stops enhance the feeling of unease and anticipation. Tone: Dark, ominous, and foreboding, evoking a sense of mystery and the unknown.'
instructions: 'Voice: Very happy.\r\n\r\nSpeed: Extremely fast.'
Tried with coral, alloy, and shimmer, with the input in Spanish or English (the instructions always in English). Even tried the mp3 and wav formats. Also tried with models gpt-4o-mini-tts-2025-12-15 and gpt-4o-mini-tts.
Even with the docs' example request, changing the instructions to something like "Speak in a dark, gloomy and slow tone" gives the same result:
curl -X POST https://api.openai.com/v1/audio/speech \
  -H "Authorization: Bearer xxxx" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini-tts",
    "input": "The quick brown fox jumped over the lazy dog.",
    "voice": "alloy",
    "instructions": "Speak in a cheerful and positive tone"
  }' \
  -o speech.mp3
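For anyone hitting this from Node rather than curl, here is a minimal sketch of the same request. The buildSpeechRequest helper is mine, purely for illustration; the payload fields mirror the curl example, and actually sending it needs a real API key (Node 18+ for built-in fetch), so the network part is shown commented out.

```javascript
// Build the JSON body for POST https://api.openai.com/v1/audio/speech.
// buildSpeechRequest is an illustrative helper, not part of any SDK.
function buildSpeechRequest(instructions) {
  return {
    model: "gpt-4o-mini-tts",
    input: "The quick brown fox jumped over the lazy dog.",
    voice: "alloy",
    instructions,
  };
}

// Sending it requires a real key, so it is commented out here:
// const res = await fetch("https://api.openai.com/v1/audio/speech", {
//   method: "POST",
//   headers: {
//     Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
//     "Content-Type": "application/json",
//   },
//   body: JSON.stringify(
//     buildSpeechRequest("Speak in a cheerful and positive tone")
//   ),
// });
// require("node:fs").writeFileSync(
//   "speech.mp3", Buffer.from(await res.arrayBuffer())
// );
```

Whatever string you pass as instructions ends up verbatim in the request body, so the monotone output is not a client-side serialization issue.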
YES! I noticed the same. I do a live stream where I use several distinct AI characters on screen, and the TTS instructions are very important!
I used to get very expressive delivery, but now everything sounds monotone, like the AI is bored. Last Saturday this was working as expected, but last night (Wednesday) the voice was different.
It seems to be an actual update to the TTS model. But I found a fix, I think.
It seems that gpt-4o-mini-tts and gpt-4o-mini-tts-2025-12-15 are broken, but gpt-4o-mini-tts-2025-03-20 still works. I need to test it more, but that must be what we were using before things got switched up.
Instructions: Voice: Deep, hushed, and enigmatic, with a slow, deliberate cadence that draws the listener in. Phrasing: Sentences are short and rhythmic, building tension with pauses and carefully placed suspense. Punctuation: Dramatic pauses, ellipses, and abrupt stops enhance the feeling of unease and anticipation. Tone: Dark, ominous, and foreboding, evoking a sense of mystery and the unknown.
You are right. The instructions were not properly followed, resulting in a rather bland mood.
EDIT: Just tested with a completely different voice instruction and the result was not much different from the first.
This is very disappointing. I don’t use snapshot models.
I have a daily automation that sends me an audio message using the gpt-4o-mini-tts model along with a fixed set of instructions for emotion and tone.
The audio I received on Tuesday (1/13/25) sounded great and matched the expected tone. However, the one from Wednesday (1/14/25) was awful!! Completely monotone, no expression at all. Just flat and boring.
I re-ran the automation manually, in case it was a one-off execution issue, but that didn’t help! Then on Thursday (1/16/25), I got the same strange result again.
I tried the recommendation mentioned in this thread and switched to the gpt-4o-mini-tts-2025-03-20 model. After a few manual runs, the audio now seems to have the correct tone and emotion based on the instructions, and the results are consistent with what I was getting before.
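In case it helps anyone else automating this, the switch is just a matter of which model string you send; only the two model names below come from this thread, and the fallback helper is my own sketch.

```javascript
// The bare name is a moving alias that follows the latest snapshot;
// the dated name pins a fixed snapshot (the one that, per this thread,
// still follows style instructions).
const ALIAS_MODEL = "gpt-4o-mini-tts";             // moving alias
const PINNED_MODEL = "gpt-4o-mini-tts-2025-03-20"; // fixed, dated snapshot

// Illustrative helper: prefer the pinned snapshot, fall back to the
// alias if no pin is provided.
function resolveModel(pin = PINNED_MODEL) {
  return pin || ALIAS_MODEL;
}
```

Pinning the dated snapshot means your output stays stable even when the alias is repointed at a new snapshot, which appears to be exactly what happened here.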
The default snapshot for this model has been updated very recently and the newer model snapshot behaves decidedly differently from the old one.
I have created a little write-up here but agree that the older version is a lot more consistent at following style and tone instructions.
This is just to explain the root cause for some of the challenges reported in this topic.
Hope this helps!
I just tried 2025-03-20, and it seems to follow my instructions better. However, the audio quality with the latest model is much clearer, with fewer audio artifacts. Maybe it uses a better audio tokenizer or vocoder to reconstruct the speech.
Haha, that's funny. I wasted time trying to figure out what happened and thought one of my AI engineering systems had changed something. Good to know; now I can sleep better at night.
Wonder if they are going to fix/retrain or if they know what the cause was.
gpt-4o-mini-tts-2025-12-15 is so awful compared to the previous gpt-4o-mini-tts-2025-03-20.
It's some of the most robotic and monotone TTS I have ever heard; gpt-4o-mini-tts-2025-03-20 was actually great.
All of the voices are completely changed and have lost most of their tone and emotion.
It just sounds awful, not natural at all, whereas the previous version did.
I'm talking about using it over the API, with instructions to set the tone and many other things
(a text model analyzes the message to set the instructions for Accent, Emotional range, Intonation, Impressions, Speed of speech, Tone, Style, and Whispering, and that's sent to the TTS).
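To make that two-stage setup concrete, here is a rough sketch of how the structured analysis can be flattened into the free-form instructions string the TTS endpoint expects. The field names and the formatInstructions helper are illustrative only, not an official schema.

```javascript
// Collapse a structured style analysis (produced by a text model in the
// setup described above) into one free-form instructions string for TTS.
function formatInstructions(style) {
  return Object.entries(style)
    .map(([key, value]) => `${key}: ${value}`)
    .join("\n");
}

// Example analysis for one on-screen character:
const style = {
  Accent: "neutral",
  "Emotional range": "wide and expressive",
  Intonation: "rising on questions",
  "Speed of speech": "slightly fast",
  Tone: "playful",
};

const instructions = formatInstructions(style);
// `instructions` is then sent as the instructions field of the TTS request.
```

With the new snapshot, the same generated instructions string produces a flat delivery, so the regression is on the model side, not in how the pipeline assembles the instructions.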