WebRTC Real Time API with microcontroller

Hi! In the Day 9 demo, we saw a stuffed animal with a microcontroller consuming the WebRTC Real-Time API (link: YouTube Live).

Could you provide more details about the overall architecture? For example, is there a local server involved (like Node.js), or does the microcontroller connect directly to the API?

Also, would this be feasible to implement on an ESP32?

Any examples you can share would be greatly appreciated.

Thanks!


We just published the source code for this!


Thanks!
I will try to adapt your code for this ESP32 S3 Radio I have and post here if I succeed.


Will have to try this on an RP100.

I’m really hoping this will work @olivier.ros - please tag me as soon as you get it working, along with the steps to recreate it :open_hands: :yellow_heart:

@vitaliy.hayda @olivier.ros @Foxalabs I would love to make this platform agnostic! If you want to put up your WIP PRs, I will make sure this ends up being something usable for everyone :slight_smile:


I wrote a lib for Arduino to access ChatGPT back in the GPT-3.5 days :smiley: It was a fun little task. Hopefully I will get some spare time over the holidays to experiment with realtime on a bunch of hardware in my dev cave.

1 Like

I made it work on the ESP32 S3 Muse Radio, but it’s picking up its own response. I think WebRTC has built-in noise cancellation, but how do I activate it?
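One common workaround for a device hearing its own speaker, independent of whatever echo handling the WebRTC stack provides, is a half-duplex gate: send silence instead of microphone audio while the speaker is playing, plus a short tail. Below is a minimal sketch of that idea in Python for illustration only. `HalfDuplexGate`, its method names, and the 300 ms hold-off are hypothetical and not taken from the published source code; on the ESP32 the same logic would sit between the I2S capture and the encoder.

```python
from dataclasses import dataclass


@dataclass
class HalfDuplexGate:
    """Silences microphone frames while the speaker is playing, plus a short tail.

    Hypothetical helper for illustration; not part of any WebRTC library.
    """

    holdoff_ms: int = 300          # assumed tail so room echo can die down
    _muted_until_ms: float = -1.0  # mic stays muted until this timestamp

    def on_playback(self, now_ms: float, frame_ms: float) -> None:
        # Call this for every audio frame handed to the speaker.
        end_of_mute = now_ms + frame_ms + self.holdoff_ms
        self._muted_until_ms = max(self._muted_until_ms, end_of_mute)

    def filter_mic(self, now_ms: float, frame: bytes) -> bytes:
        # Return silence of the same length while muted so frame timing is kept.
        if now_ms < self._muted_until_ms:
            return bytes(len(frame))
        return frame


if __name__ == "__main__":
    gate = HalfDuplexGate(holdoff_ms=300)
    mic_frame = b"\x01\x02" * 160               # one 20 ms frame at 8 kHz, pcm16
    gate.on_playback(now_ms=100, frame_ms=20)   # speaker starts playing at t=100 ms
    print(gate.filter_mic(200, mic_frame) == bytes(320))  # muted while playing
    print(gate.filter_mic(500, mic_frame) == mic_frame)   # passes after the tail
```

The trade-off is that the user cannot interrupt the device while it is speaking, so this is a stopgap rather than a replacement for real echo cancellation.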


I’ve ordered these components from M5stack. Hopefully, that’s enough to try to replicate this :slight_smile:

  • Atomic Echo Base with Microphone and Speaker
  • ATOMS3R Development Kit with 0.85-inch Screen (8MB PSRAM)

m5stack [dot] com

(I am not sponsored by them and do not work for them.)

I have tried the demo in the link above, but I had difficulty getting it to recognize what I am saying (speaking in English). I notice it is set to a sample rate of 8 kHz. From my understanding, the recommended sample rate for pcm16 over WebSockets is 24 kHz, and I thought this was the sample rate the speech recognition model was trained on. In past testing I have not had much success at 8 kHz using pcm16. What are the optimum sample rate and settings with the Opus codec? Also, the WebSocket documentation does not list the Opus codec. Can Opus only be used with WebRTC?
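To make the sample-rate difference concrete, here is a small Python sketch (for illustration only, with hypothetical function names) that computes the size of a pcm16 frame at 8 kHz and 24 kHz and upsamples 8 kHz audio by a factor of 3 using plain linear interpolation. The 20 ms frame length is an assumption. Note that upsampling only changes the rate: it cannot restore the content above 4 kHz that an 8 kHz capture never recorded, so capturing at the higher rate is preferable if recognition quality is the problem.

```python
import array


def pcm16_frame_bytes(sample_rate_hz: int, frame_ms: int, channels: int = 1) -> int:
    """Number of bytes in one frame of 16-bit PCM (2 bytes per sample)."""
    samples = sample_rate_hz * frame_ms // 1000
    return samples * channels * 2


def upsample_linear(samples: array.array, factor: int) -> array.array:
    """Upsample mono int16 samples by an integer factor with linear interpolation."""
    out = array.array("h")
    n = len(samples)
    for i in range(n):
        cur = samples[i]
        nxt = samples[i + 1] if i + 1 < n else cur  # hold the last sample
        for k in range(factor):
            # Integer interpolation between cur and nxt; stays within int16 range.
            out.append(cur + (nxt - cur) * k // factor)
    return out


if __name__ == "__main__":
    print(pcm16_frame_bytes(8000, 20))    # bytes per 20 ms frame at 8 kHz
    print(pcm16_frame_bytes(24000, 20))   # bytes per 20 ms frame at 24 kHz
    low_rate = array.array("h", [0, 300, -300])
    print(list(upsample_linear(low_rate, 3)))
```

A 20 ms frame is 320 bytes at 8 kHz and 960 bytes at 24 kHz, so the higher rate triples the raw audio the microcontroller has to buffer and send before any codec is applied.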