I’ve reviewed all the Agents SDK documentation and I’m not clear about model API compatibility, specifically:
Will it work with the OpenAI Realtime preview model?
Will it work with Azure OpenAI model endpoints? If so, is that limited to specific models?
I am about to start testing these things myself, but I would appreciate any time-saving directions anyone can recommend among the multiple approaches listed here: https://openai.github.io/openai-agents-python/models/
… or save me the time by shutting down my tests entirely if these models are just not supported now
Specifically around the Realtime model, which does not support structured outputs out of the box: would that lead to the bad request error mentioned in the link above, as with other models that don't support structured outputs?
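For reference, this is roughly what I was planning to test for the Azure case, routing the agent through Chat Completions on an Azure client. It's only a sketch; the endpoint, key, API version, and deployment name are placeholders from my own setup, not anything the docs promise will work:

```python
from openai import AsyncAzureOpenAI
from agents import Agent, OpenAIChatCompletionsModel, Runner, set_tracing_disabled

# Placeholder Azure details; substitute your own resource, key, and deployment.
azure_client = AsyncAzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",
    api_key="<azure-key>",
    api_version="2024-10-21",
)

# Tracing uploads would need an OpenAI platform key, which I won't have set here.
set_tracing_disabled(True)

agent = Agent(
    name="Azure test agent",
    instructions="You are a helpful assistant.",
    # Route the agent through Chat Completions on the Azure client rather than
    # the default Responses API.
    model=OpenAIChatCompletionsModel(
        model="gpt-4o",  # the Azure *deployment* name in my setup
        openai_client=azure_client,
    ),
    # No output_type set, so the SDK shouldn't request structured outputs,
    # which I assume sidesteps the bad request error for models lacking them.
)

result = Runner.run_sync(agent, "Say hello.")
print(result.final_output)
```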
gpt-4o-realtime-preview is only compatible with the realtime endpoint, over a WebRTC or a WebSocket interface.
It requires an audio modality as input or output.
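For context, talking to that model looks roughly like this over the WebSocket interface, using the openai Python client's beta realtime helper. This is only a sketch from memory of the beta; event names and session fields may differ in your client version:

```python
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def main() -> None:
    # The realtime preview models are only reachable through this endpoint,
    # not via Chat Completions or Responses.
    async with client.beta.realtime.connect(
        model="gpt-4o-realtime-preview",
    ) as conn:
        await conn.session.update(session={"modalities": ["audio", "text"]})

        await conn.conversation.item.create(
            item={
                "type": "message",
                "role": "user",
                "content": [{"type": "input_text", "text": "Say hello."}],
            }
        )
        await conn.response.create()

        # Print the transcript of the spoken reply as it streams in.
        async for event in conn:
            if event.type == "response.audio_transcript.delta":
                print(event.delta, end="", flush=True)
            elif event.type == "response.done":
                break

asyncio.run(main())
```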
The Agents SDK is written specifically for the Responses endpoint, which currently doesn't support audio. Even gpt-4o-audio-preview, which you can use on Chat Completions, is not allowed on Responses, as the model details page shows.
You would have to use an external transcription tool or TTS, depending on what you are trying to do.
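Something along these lines would work, for example. This is just a sketch assuming Whisper for the transcription step and tts-1 for speech on the way back out; swap in whatever STT/TTS you prefer:

```python
import asyncio
from openai import AsyncOpenAI
from agents import Agent, Runner

client = AsyncOpenAI()
agent = Agent(name="Assistant", instructions="Answer briefly.")

async def handle_turn(audio_path: str) -> bytes:
    # 1) Transcribe the user's audio with a speech-to-text model.
    with open(audio_path, "rb") as audio_file:
        transcript = await client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )

    # 2) Run the agent on the transcribed text (Responses under the hood).
    result = await Runner.run(agent, transcript.text)

    # 3) Turn the agent's text reply back into speech.
    speech = await client.audio.speech.create(
        model="tts-1",
        voice="alloy",
        input=result.final_output,
    )
    return speech.content  # raw audio bytes (mp3 by default)

# asyncio.run(handle_turn("question.wav"))
```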
While I understand all your points, to be fair, they're not touting the SDK as being only for Responses; it sounds like they intend it to be a more universal framework, so I'd be interested to hear whether these previews will come closer together at some point.
I understand how to switch to Responses with external transcription, but I choose not to because I've achieved a more fluid voice-first experience with the native multimodal audio or text input <> output model, so I do hope the team continues to push it.