New Audio Model Snapshots in the Realtime API

Announcement from OpenAI Developers

New audio model snapshots are now live in the Realtime API, with improved reliability, lower error rates, and fewer hallucinations:

  • gpt-4o-mini-transcribe-2025-12-15: 89% reduction in hallucinations compared to whisper-1

  • gpt-4o-mini-tts-2025-12-15: 35% fewer word errors as measured by Common Voice

  • gpt-realtime-mini-2025-12-15: 22% improvement in instruction following and 13% improvement in function calling



The new text-to-speech and speech-to-text snapshots are stronger in languages like Chinese, Japanese, Indonesian, Hindi, Bengali, and Italian.

Try the new models today:

https://platform.openai.com/audio/realtime

This may be a stupid question: Why have snapshots on these models? Why not just upgrade gpt-4o-mini-transcribe and gpt-4o-mini-tts and gpt-realtime-mini to newer versions without snapshot naming conventions?

So now you have:

  • gpt-4o-mini-transcribe-2025-12-15 and gpt-4o-mini-transcribe
  • gpt-4o-mini-tts-2025-12-15 and gpt-4o-mini-tts
  • gpt-realtime-mini-2025-12-15 and gpt-realtime-mini

We would love to roll out new versions of gpt-4o-mini-transcribe and gpt-4o-mini-tts to our customers, but not under snapshot names. Why? Because a dated name seems so temporary.

I may be missing some context here, so my first question: is always upgrading to the latest snapshot, for example by using a generic slug like gpt-4o-mini-tts, the approach you would prefer for other model types as well?

Regardless of benchmark improvements, we typically evaluate new models against real production use cases before rolling them out to users and customers.

I’m most likely missing something. Feel free to point me in the right direction.

The version with a date in the name IS the model (and it cannot really be called a snapshot since OpenAI will continue to make small changes anyway).

The version without a date is a pointer to the currently recommended model, an alias.

gpt-4o-mini-realtime-preview-2024-12-17 ← gpt-4o-mini-realtime-preview
gpt-realtime-mini-2025-10-06 ← gpt-realtime-mini
gpt-realtime-mini-2025-12-15

After a model has been rolled out, vetted, and become “recommended”, the convenience alias may be pointed to that newer model. Or it may never be, as in the case of gpt-4o, which was never pointed to gpt-4o-2024-11-20 (newer than gpt-4o-2024-08-06).
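
The alias relationship can be pictured as a lookup table. Here is a minimal sketch in Python that uses only the pairs listed above. The `ALIASES` dict and the `resolve` helper are hypothetical names for illustration, not part of any OpenAI SDK, and the real alias targets are controlled by OpenAI and can change.

```python
# Hypothetical alias table built from the pairs listed in this post.
# Real alias targets are controlled by OpenAI and may change over time.
ALIASES = {
    "gpt-4o-mini-realtime-preview": "gpt-4o-mini-realtime-preview-2024-12-17",
    "gpt-realtime-mini": "gpt-realtime-mini-2025-10-06",
}


def resolve(model: str) -> str:
    """Return the dated model an alias points to; dated names resolve to themselves."""
    return ALIASES.get(model, model)


# The alias still resolves to the older dated model in this table.
print(resolve("gpt-realtime-mini"))
# A dated name is already the model, so it is returned unchanged.
print(resolve("gpt-realtime-mini-2025-12-15"))
```

The point of the sketch is that a request naming the alias gets whatever the table says at that moment, while a request naming the dated model is unaffected by edits to the table.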

If you want stability in your applications, use the versioned model whenever one is offered. When a new model like this release comes out, test it yourself, make any needed adjustments to prompting and parameters, and move to the latest version when you choose.

Do not employ the alias and let OpenAI decide when to switch models on you and change the behavior of your application. Got it?
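
One way to enforce this in a codebase is to check at startup or in CI that every configured model name is dated. The following is a rough sketch, not an official tool: the `APP_MODELS` config, the `is_pinned` and `unpinned` helpers, and the assumption that dated names always end in `-YYYY-MM-DD` are all mine, based only on the names seen in this thread.

```python
import re

# Assumption: a dated model name ends in -YYYY-MM-DD,
# e.g. gpt-4o-mini-tts-2025-12-15. Anything else is treated as an alias.
DATED_SUFFIX = re.compile(r"-\d{4}-\d{2}-\d{2}$")


def is_pinned(model: str) -> bool:
    """True if the model name carries a date suffix."""
    return DATED_SUFFIX.search(model) is not None


def unpinned(models: dict[str, str]) -> list[str]:
    """Return the config keys whose model is an alias rather than a dated name."""
    return sorted(key for key, model in models.items() if not is_pinned(model))


# Hypothetical application config for illustration.
APP_MODELS = {
    "transcribe": "gpt-4o-mini-transcribe-2025-12-15",
    "tts": "gpt-4o-mini-tts",  # alias: behavior may change when it is re-pointed
    "realtime": "gpt-realtime-mini-2025-12-15",
}

print(unpinned(APP_MODELS))  # ['tts']
```

Failing a build when `unpinned` returns a non-empty list makes the move to a newer model an explicit change in your config rather than something that happens to you.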

@vb @_j

“The version without a date is a pointer to the currently recommended model, an alias”

Ah, yes. I completely forgot about this - makes total sense now. Sorry for my confusion.

I’m always happy to see new audio models, particularly TTS.

Apparently the new snapshot has more trouble than the previous one following instructions (like “speak slower” or “[laughs]”), so it will probably take more extensive testing to see what pros and cons it brings.
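
For that kind of side-by-side testing, it helps to generate the same set of requests for both models and then listen to the results. This is a rough sketch only: it builds request bodies and sends nothing. The `PROMPTS` list and `build_requests` helper are made up for illustration, the field names (`model`, `voice`, `input`, `instructions`) and the `alloy` voice follow the speech endpoint as I understand it, and the baseline model name should be replaced with whatever your application uses today.

```python
from itertools import product

# Hypothetical test prompts covering the two cases mentioned above:
# an inline "[laughs]" tag and a "speak slower" style instruction.
PROMPTS = [
    {"input": "Welcome back. [laughs] It has been a while.", "instructions": None},
    {"input": "Your order ships on Monday.", "instructions": "Speak slower."},
]


def build_requests(models, prompts, voice="alloy"):
    """Build one request body per (model, prompt) pair without sending anything."""
    requests = []
    for model, prompt in product(models, prompts):
        body = {"model": model, "voice": voice, "input": prompt["input"]}
        if prompt["instructions"]:
            body["instructions"] = prompt["instructions"]
        requests.append(body)
    return requests


# Baseline first, then the new snapshot, so outputs can be compared pairwise.
requests = build_requests(["gpt-4o-mini-tts", "gpt-4o-mini-tts-2025-12-15"], PROMPTS)
for body in requests:
    print(body)
```

Keeping the voice and input identical across models means any difference you hear comes from the model change and not from the request.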
