Interesting article about the story behind the Responses API

New article on OpenAI developers blog:

Old thread with a bit more detail about how the Chat Completions and Responses APIs were created:

Related article:


Multimodal from the ground up. Text, images, audio…

That lets you know this is merely justification and spin.

Been getting audio in and out of Chat Completions for a year.
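For anyone who hasn't tried it, audio in and out of Chat Completions looks roughly like this. A minimal sketch with the Python SDK and an audio-capable model; the file name and prompt are just placeholders:

```python
import base64
from openai import OpenAI

client = OpenAI()

# Read a short WAV clip to send as audio input.
with open("question.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode()

# Ask for both text and audio back from an audio-capable model.
completion = client.chat.completions.create(
    model="gpt-4o-audio-preview",
    modalities=["text", "audio"],
    audio={"voice": "alloy", "format": "wav"},
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Answer the question in this clip."},
                {"type": "input_audio",
                 "input_audio": {"data": audio_b64, "format": "wav"}},
            ],
        }
    ],
)

msg = completion.choices[0].message
print(msg.audio.transcript)   # text transcript of the spoken reply
audio_id = msg.audio.id       # server-side handle, valid until msg.audio.expires_at
```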

Responses has had an “input audio” type snuck into “content” in the YAML spec for a few weeks - and the API still rejects it to this day (yet doesn’t outright reject the non-working “gpt-audio” model).

Chat Completions is already stateful server-side with audio, and the stored audio actually expires when promised. Just pass back the ID you received - sound familiar? Completely doable with reasoning also.
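That “pass the ID” flow, continuing the sketch above (assuming `client` and `audio_id` were captured from the earlier response):

```python
# Follow-up turn: reference the assistant's previous audio by ID instead of
# resending the base64 data - the server holds it until its expires_at.
followup = client.chat.completions.create(
    model="gpt-4o-audio-preview",
    modalities=["text", "audio"],
    audio={"voice": "alloy", "format": "wav"},
    messages=[
        {"role": "user", "content": "Answer the question in this clip."},
        {"role": "assistant", "audio": {"id": audio_id}},
        {"role": "user", "content": "Now say that again, but shorter."},
    ],
)
print(followup.choices[0].message.audio.transcript)
```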

I am so excited about the Responses API, and Codex now seems to know how to configure it!!! Awesome

You make an interesting observation that I’ve also found rings true: gpt-5 has had training effort put into it, past its “knowledge cutoff date,” about the Responses API. So much so that it will conflate that training into Chat Completions API usage.

The AI model can’t reconcile the 2024 knowledge-cutoff date injected into its system message (even on the API) with the blog date of the endpoint’s introduction, which it knows without any searching - so it ends up claiming that Responses was introduced in May 2024.

Thanks for sharing! The Responses API looks like a solid step toward unifying completions, chat, and multi-modal into one consistent interface. Excited to see how it simplifies future development.