We announced the Realtime API last week at DevDay SF. It’s been amazing to see its adoption—here are some of the coolest examples we’ve seen so far. Let us know here if you have an example too!
Welp, there goes all the free time I had saved up.
Thanks, Edwin!
Seriously, though, some of these sound very interesting.
Love to see what members of our community have been building!
Sawyer Hood (@sawyerhood) on X
THIS ↑↑
I’ve only looked at a few of them so far (I started from the bottom), but the description for this implementation seems to sell the creator’s work short.
He’s not just browsing the web with his voice; his application is acting like an agent.
It’s pausing to ask for his input, it’s adding things to check-out carts, and it’s even removing things it added incorrectly when he asks it to.
This seems like something that OpenAI is suggesting will exist in the next year, yet here it is already?
Hey there!
A couple of weeks ago, I had the opportunity to participate in the OpenAI Builders Lab in Paris, where I was able to explore the potential of the Realtime API. I was amazed by how quickly (6 hours) I could build a cooking assistant prototype with a real-time web-based voice interface that uses function calls. And a
To go further, as I explored the possibilities beyond this first use case, it became clear there were some challenges to overcome before it could become a deployable product.
Since then, I’ve kept pushing and learned a lot about improving reliability through WebRTC, building a dynamic UI by combining Remote Procedure Calls with function calls, … and much more.
I’m happy to share what I learned as open-source groundwork:
What’s in it:
- Web Interface: Intuitive and responsive UI built with HTML, Tailwind CSS, and Alpine.js.
- Real-time Communication: Leveraging LiveKit for seamless audio (and soon video) streaming.
- AI-Powered Responses: Integration with OpenAI’s language models for intelligent and context-aware interactions.
- Transcription and Summarization: Recap and summarize conversation transcripts for easy reference.
- Customizable Roles: Define various roles for the assistant, each with unique instructions and configurations.
- Session Management: Handle user sessions with capabilities to start, terminate, and summarize conversations.
- Agent Authentication: Manage Flask sessions and LiveKit tokens.
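To give a rough idea of how the "Customizable Roles" and "Session Management" pieces could fit together, here is a minimal sketch in plain Python. It is not the project's actual code: `Role`, `Session`, `ROLES`, `start_session`, and the `voice` field are hypothetical names chosen for illustration, and the recap simply returns the last few transcript lines where a real implementation would presumably call a language model.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Role:
    """Hypothetical role: a name plus the instructions given to the assistant."""
    name: str
    instructions: str
    voice: str = "default"  # assumed configuration field, for illustration only


@dataclass
class Session:
    """Hypothetical in-memory session that records one conversation."""
    role: Role
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    transcript: List[str] = field(default_factory=list)
    active: bool = True

    def add_turn(self, speaker: str, text: str) -> None:
        # Refuse new turns once the session has been terminated.
        if not self.active:
            raise RuntimeError("session already terminated")
        self.transcript.append(f"{speaker}: {text}")

    def terminate(self) -> None:
        self.active = False

    def recap(self, max_turns: int = 3) -> str:
        # Placeholder recap: returns the last few turns verbatim.
        # A real version would send the transcript to a model instead.
        if max_turns <= 0:
            return ""
        return "\n".join(self.transcript[-max_turns:])


# Each role carries its own instructions (hypothetical examples).
ROLES: Dict[str, Role] = {
    "cooking": Role("cooking", "You are a patient cooking assistant."),
    "tutor": Role("tutor", "You are a concise language tutor."),
}


def start_session(role_name: str) -> Session:
    """Start a new session for one of the configured roles."""
    return Session(role=ROLES[role_name])
```

Usage would look like `s = start_session("cooking")`, then `s.add_turn("user", "...")` for each turn, and finally `s.terminate()` followed by `s.recap()`. Keeping roles as immutable data separate from sessions means adding a new assistant persona is just one more entry in `ROLES`.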
Prerequisites
- LiveKit Account
- OpenAI API Key
There’s a lot more in the project doc; I hope it helps. Feedback is welcome!
And all I want for Christmas is a “Realtime Vision” in the API
-ping @romainhuet @katiagg