I'm using the audio generation feature of the Chat Completions API, which means setting modalities to ["text", "audio"] in the request. For the record, I'm using the official OpenAI Python SDK.
Here is the code:
completion = await self.openai_client.chat.completions.create(
    model="gpt-4o-audio-preview",
    modalities=["text", "audio"],
    audio={"voice": "alloy", "format": "pcm16"},
    messages=prompt,
)
audio = completion.choices[0].message.audio
if audio is None:
    logger.info("Audio generated from openai audio generation api is None")
In the message history, I reference the previous assistant turn by its audio ID, like this:
{
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": "Hello"
        }
    ]
},
{
    "role": "assistant",
    "audio": {
        "id": "audio_672d6aeabb9c81909f950c75e781aaf5"
    }
}
Frequently, completion.choices[0].message.audio is None.
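To help narrow this down, here is a minimal sketch of a diagnostic helper that records what the response did contain when the audio is missing. The name `describe_choice` is hypothetical, and I'm assuming `finish_reason`, `content`, and `refusal` are present on the response alongside `audio`. The helper uses `getattr` so it still works if any of them is absent. The stand-in object at the bottom only mimics the response shape so the snippet runs without an API call.

```python
from types import SimpleNamespace


def describe_choice(completion):
    """Summarise the first choice so missing audio can be correlated with other fields."""
    choice = completion.choices[0]
    message = choice.message
    audio = getattr(message, "audio", None)
    return {
        "finish_reason": getattr(choice, "finish_reason", None),
        "has_audio": audio is not None,
        "audio_id": getattr(audio, "id", None),
        "has_text": bool(getattr(message, "content", None)),
        "has_refusal": bool(getattr(message, "refusal", None)),
    }


# Stand-in object (not a real API response): text came back, audio did not.
fake = SimpleNamespace(
    choices=[
        SimpleNamespace(
            finish_reason="stop",
            message=SimpleNamespace(content="Hello!", audio=None, refusal=None),
        )
    ]
)
summary = describe_choice(fake)
print(summary)
```

Logging this summary in place of the single "is None" message should show whether the missing audio coincides with a text-only reply, a refusal, or a particular finish reason.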
If this is caused by a bug in the Python SDK, please let me know so that I can report it on the SDK's official GitHub repository.
Thanks : )