Hi everyone,
I’m testing the new gpt-4o-transcribe-diarize model for diarization (speaker separation) using the official Python SDK, but the API keeps returning this error:
Error code: 400 - {
  "error": {
    "message": "chunking_strategy is required for diarization models",
    "type": "invalid_request_error",
    "param": "chunking_strategy",
    "code": "invalid_value"
  }
}
Here’s the minimal Python example I’m using:
from openai import OpenAI
import os

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

with open("test.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe-diarize",
        file=audio_file,
        chunking_strategy={
            "type": "auto"
        }
    )
print(transcript)
The same code works perfectly with gpt-4o-transcribe and whisper-1,
so I assume the issue is specific to the diarization model.
Questions:
- Is there any official format or schema for chunking_strategy?
- Does the openai Python SDK (>=1.50) currently support diarization?
- Is there a working example or preview doc available for gpt-4o-transcribe-diarize?
I’m testing this on macOS with Python 3.11 and openai==1.51.0.
Goal: transcribe and separate speakers for customer service calls (agent / client).
Thanks in advance 
— Big Mike
There is an example here in the docs.
Most likely your Python package is outdated.
Try running pip install --upgrade openai and remove the chunking_strategy parameter, as the SDK defaults to “auto”.
This one works fine for me:
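from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

audio_filename = "yourfile.mp3"
# Open the file in a context manager so it is closed after the request
with open(audio_filename, "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        file=audio_file,
        model="gpt-4o-transcribe-diarize",
        response_format="diarized_json",
    )
print(transcript.text, transcript.to_dict())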
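Since an outdated package is the suspected cause, it may help to confirm which openai version the running interpreter actually sees, because pip can upgrade a different environment than the one executing the script. The snippet below is a minimal sketch using only the standard library; the helper name installed_version is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="openai"):
    """Return the installed version string of a package, or None if it is missing.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the package or call the API.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("openai")
if found is None:
    print("openai is not installed in this environment")
else:
    print(f"openai version seen by this interpreter: {found}")
```

If the printed version differs from what pip reported after upgrading, the script is probably running in another virtual environment.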
Thanks for the suggestion!
I actually tried exactly the example from the docs:
audio_file = open("yourfile.mp3", "rb")
transcript = client.audio.transcriptions.create(
    file=audio_file,
    model="gpt-4o-transcribe-diarize",
    response_format="diarized_json"
)
print(transcript.text)
But I still get this error:
Error code: 400 - {'error': {'message': 'chunking_strategy is required for diarization models', 'type': 'invalid_request_error', 'param': 'chunking_strategy', 'code': 'invalid_value'}}
I’m on Python SDK 2.6.1, so it seems that this version still requires chunking_strategy to be set explicitly, even though the docs say it should default to “auto”.
How unusual, it works flawlessly for me.
When you specify the chunking_strategy parameter, does it still give an error?
audio_file = open("yourfile.mp3", "rb")
transcript = client.audio.transcriptions.create(
    file=audio_file,
    model="gpt-4o-transcribe-diarize",
    response_format="diarized_json",
    chunking_strategy="auto",
)
print(transcript.text, transcript.to_dict())
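Once the call succeeds, the stated goal of separating agent and client could be handled by post-processing the diarized output. The sketch below assumes that transcript.to_dict() contains a "segments" list whose items carry "speaker", "start", "end" and "text" keys; that shape is an assumption and should be checked against the actual response. The helper name merge_speaker_turns and the sample data are hypothetical, and no API call is made here.

```python
from itertools import groupby


def merge_speaker_turns(segments):
    """Collapse consecutive segments from the same speaker into one turn.

    `segments` is assumed to be a list of dicts with "speaker", "start",
    "end" and "text" keys (an assumed shape, not confirmed by the thread).
    """
    turns = []
    for speaker, group in groupby(segments, key=lambda seg: seg["speaker"]):
        group = list(group)
        turns.append({
            "speaker": speaker,
            "start": group[0]["start"],   # start of the first segment in the turn
            "end": group[-1]["end"],      # end of the last segment in the turn
            "text": " ".join(seg["text"].strip() for seg in group),
        })
    return turns


# Hypothetical sample standing in for transcript.to_dict()
sample = {
    "segments": [
        {"speaker": "A", "start": 0.0, "end": 1.5, "text": "Hello, thanks for calling."},
        {"speaker": "A", "start": 1.5, "end": 2.8, "text": "How can I help?"},
        {"speaker": "B", "start": 3.0, "end": 5.0, "text": "My order has not arrived."},
    ]
}

for turn in merge_speaker_turns(sample["segments"]):
    print(f'[{turn["start"]:.1f}s to {turn["end"]:.1f}s] {turn["speaker"]}: {turn["text"]}')
```

Mapping the speaker labels to "agent" and "client" would still need a separate rule, for example treating whoever speaks first as the agent.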
Thanks so much!
I don't know why, but your example works great.
Thanks again.