WebRTC Real-Time API with microcontroller

olivier.ros · December 17, 2024, 7:36pm

Hi! In the Day 9 demo, we saw a stuffed animal with a microcontroller consuming the WebRTC Real-Time API (link: YouTube Live).

Could you provide more details about the overall architecture? For example, is there a local server involved (like Node.js), or does the microcontroller connect directly to the API?

Also, would this be feasible to implement on an ESP32?

Any examples you can share would be greatly appreciated.

Thanks!

jeffsharris · December 17, 2024, 11:17pm

We just published the source code for this!

olivier.ros · December 18, 2024, 9:27am

Thanks!
I will try to adapt your code for this ESP32 S3 Radio I have and post here if I succeed.

Foxalabs · December 18, 2024, 9:39am

Will have to try this on an RP100.

vitaliy.hayda · December 18, 2024, 11:25am

I’m really hoping this will work @olivier.ros - please tag me as soon as you get it working, along with the steps to recreate it

Sean-Der · December 18, 2024, 5:52pm

@vitaliy.hayda @olivier.ros @Foxalabs I would love to make this platform agnostic! If you want to put up your WIP PRs, I will make sure this ends up being something usable for everyone.

Foxalabs · December 18, 2024, 5:54pm

I did a lib for Arduino to access ChatGPT back in the 3.5 days. It was a fun little task. Hopefully I will get some spare time over the holidays to go experiment with realtime on a bunch of hardware in my dev cave.

olivier.ros · December 20, 2024, 10:36pm

I made it work on the ESP32 S3 Muse Radio, but it's picking up its own response. I think WebRTC has built-in noise cancellation, but how do I activate it?
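As an illustration of the feedback problem described above, here is a minimal sketch of one common workaround: a half-duplex gate that drops microphone frames while the speaker is playing. This is not WebRTC's built-in echo cancellation and not what the published source code does; it is a simpler, swapped-in technique. The class name `HalfDuplexGate`, the 20 ms frame length, and the 200 ms hangover are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HalfDuplexGate:
    """Hypothetical helper: mutes the mic while the speaker is active."""

    hangover_ms: int = 200      # assumed tail so room echo can die out
    muted_until_ms: int = 0     # timestamp before which mic frames are dropped

    def on_speaker_frame(self, now_ms: int, frame_ms: int = 20) -> None:
        # Extend the mute window to cover this playback frame plus the tail.
        self.muted_until_ms = max(
            self.muted_until_ms, now_ms + frame_ms + self.hangover_ms
        )

    def filter_mic_frame(self, now_ms: int, frame: bytes) -> Optional[bytes]:
        # Return None (drop the frame) while muted; otherwise pass it through.
        if now_ms < self.muted_until_ms:
            return None
        return frame
```

The trade-off is that the user cannot interrupt the assistant while it is speaking, since their speech is dropped along with the echo.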

gianpaj · December 30, 2024, 10:58am

I’ve ordered these components from M5stack. Hopefully, that’s enough to try to replicate this

Atomic Echo Base with Microphone and Speaker
ATOMS3R Development Kit with 0.85-inch Screen (8MB PSRAM)

m5stack [dot] com

(I'm not sponsored by them and don't work for them.)

prothan · January 8, 2025, 6:03am

I tried the demo in the link above. I had difficulty getting it to recognize what I was saying (speaking in English). I notice it is set to a sample rate of 8 kHz. From my understanding, the recommended sample rate for pcm16 over WebSockets is 24 kHz, and I thought this was the sample rate the speech recognition model was trained on. In past testing I have not had much success at 8 kHz using pcm16. What are the optimum sample rate and settings for the Opus codec? Also, the WebSocket documentation does not list the Opus codec. Can Opus only be used with WebRTC?
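To put the two sample rates mentioned above side by side, here is a small arithmetic sketch for uncompressed 16-bit mono PCM. The 20 ms frame length and the function names are assumptions chosen for illustration, not settings taken from the demo or the API documentation. It does not answer which rate the model prefers; it only shows what each rate costs in bandwidth and what audio bandwidth it can represent.

```python
def pcm16_frame_bytes(sample_rate_hz: int, frame_ms: int = 20) -> int:
    """Bytes in one mono PCM16 frame (2 bytes per sample)."""
    samples = sample_rate_hz * frame_ms // 1000
    return samples * 2


def pcm16_bitrate_kbps(sample_rate_hz: int) -> float:
    """Raw bitrate of mono PCM16 in kbit/s (16 bits per sample)."""
    return sample_rate_hz * 16 / 1000


def nyquist_hz(sample_rate_hz: int) -> int:
    """Highest audio frequency the sample rate can represent."""
    return sample_rate_hz // 2


for rate in (8_000, 24_000):
    print(rate, pcm16_frame_bytes(rate), pcm16_bitrate_kbps(rate), nyquist_hz(rate))
# 8000 320 128.0 4000
# 24000 960 384.0 12000
```

At 8 kHz nothing above 4 kHz is captured, which is telephone-quality audio, while 24 kHz triples the raw data rate, a relevant cost on a microcontroller.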
