GPT-4o Transcribe Diarize is a transcription model that identifies who is speaking when, producing transcripts that clearly associate audio segments with individual speakers. It supports the new diarized_json response format, which returns precise speaker labels along with start and end timestamps for each segment.
What’s included:
Automatic Speaker Identification: GPT-4o Transcribe Diarize automatically detects and labels different speakers, simplifying multi-speaker audio transcription.
Speaker Reference Clips: Optionally enhance accuracy by providing short (2–10 second) reference audio clips for up to four known speakers.
API Endpoint: Available through /v1/audio/transcriptions in the Transcription API (see the usage sketch below).
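Here's a minimal sketch of what a call might look like with the official Python SDK. The model id and the segment field names in the loop are assumptions inferred from this announcement rather than confirmed identifiers, so check the API reference before relying on them:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "gpt-4o-transcribe-diarize" is an assumed model id based on the
# announced name; verify it against the models list.
with open("meeting.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe-diarize",
        file=audio_file,
        response_format="diarized_json",  # the new format from this announcement
        # Reference clips for known speakers can optionally be supplied;
        # the exact parameter name isn't given here, so it's omitted.
    )

# Field names (segments / speaker / start / end / text) are assumptions
# inferred from "speaker labels along with segment start and end timestamps".
for segment in transcript.segments:
    print(f"[{segment.speaker}] {segment.start:.1f}s–{segment.end:.1f}s: {segment.text}")
```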
Speaker diarization has been a frequent request from our developer community; this feature is a meaningful improvement over existing transcription tools.
Awesome… We’ve been waiting for diarization from OpenAI.
But one missing feature is preventing us from adopting this model: the ability to set a minimum/maximum number of speakers. We love the known-speakers capability, but capping the speaker count is crucial for us, as diarization models often hallucinate more speakers than are actually present. Ensuring that doesn't happen is important for the accuracy of our product.
I did some testing, but the diarization model seems way worse than the old 4o transcribe model.
The new model skips over words that the old one didn't. Anyone else seeing this?