One week with the Realtime API

edwinarbus · October 11, 2024, 12:43am

We announced the Realtime API last week at DevDay SF. It’s been amazing to see its adoption—here are some of the coolest examples we’ve seen so far. Let us know here if you have an example too!

PaulBellow · October 11, 2024, 12:52am

Welp, there goes all the free time I had saved up.

Thanks, Edwin!

Seriously, though, some of these sound very interesting.

Love to see what members of our community have been building!

LetsExperiment · October 13, 2024, 3:49pm

Sawyer Hood (@sawyerhood) on X

THIS ↑↑

I’ve only looked at a few of them so far (I started from the bottom), but the description for this implementation seems to sell the creator’s work short.
He’s not just browsing the web with his voice; his application is acting like an agent.
It pauses to ask for his input, adds things to checkout carts, and even removes items it added incorrectly when asked.
This seems like something OpenAI is suggesting will exist in the next year, yet here it is already?

xavier.basset · December 17, 2024, 10:35am

Hey there!

A couple of weeks ago, I had the opportunity to participate in the OpenAI Builders Lab in Paris, where I was able to explore the potential of the Realtime API. I was amazed by how quickly (6 hours) I could build a cooking assistant prototype with a real-time web-based voice interface that uses function calls. And a

As I explored the possibilities beyond this first use case, it became clear there were some challenges to overcome before it could become a deployable product.

Since then, I’ve pushed the effort further and learned a lot about improving reliability through WebRTC, dynamic UI using Remote Procedure Calls x Function Calls, … and much more.

I’m happy to share what I learned as an open-source groundwork:

What’s in it:

Web Interface: Intuitive and responsive UI built with HTML, Tailwind CSS, and Alpine.js.
Real-time Communication: Leveraging LiveKit for seamless audio (and soon video) streaming.
AI-Powered Responses: Integration with OpenAI’s language models for intelligent and context-aware interactions.
Transcription and Summarization: Recap and summarize conversation transcripts for easy reference.
Customizable Roles: Define various roles for the assistant, each with unique instructions and configurations.
Session Management: Handle user sessions with capabilities to start, terminate, and summarize conversations.
Agent Authentication: Manage Flask sessions and LiveKit tokens.
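
To make the "Customizable Roles" idea concrete, here is a minimal sketch of one way roles could be modeled. It is not taken from the project: the `Role` class, the `ROLES` registry, the role names, and the prompts are all hypothetical. It also assumes the role is applied by sending a `session.update` event directly to the Realtime API, whereas the project itself goes through LiveKit and may configure sessions differently.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical role definition: a name plus per-role assistant settings."""
    name: str
    instructions: str
    voice: str = "alloy"       # assumed default voice
    temperature: float = 0.8   # assumed default sampling temperature


# Hypothetical registry; names and prompts are made up for illustration.
ROLES = {
    "chef": Role("chef", "You are a cooking assistant. Guide the user step by step."),
    "recap": Role("recap", "Summarize the conversation so far.", temperature=0.6),
}


def session_update_event(role_name: str) -> str:
    """Build a JSON `session.update` event that applies the chosen role's settings."""
    role = ROLES[role_name]
    event = {
        "type": "session.update",
        "session": {
            "instructions": role.instructions,
            "voice": role.voice,
            "temperature": role.temperature,
        },
    }
    return json.dumps(event)
```

Switching roles mid-conversation would then amount to sending `session_update_event("recap")` over whatever connection the session already uses.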

Prerequisites

LiveKit Account
OpenAI API Key

There’s a lot more in the project doc; I hope it helps. Feedback is welcome!

And all I want for Christmas is a “Realtime Vision” in the API
-ping @romainhuet @katiagg
