Here is a Realtime Voice API plugin for Unreal Engine, with talking 3D MetaHumans.

The video is unedited, and it even includes an extra 150-200 ms of latency because I'm running the UI version of Audio2Face on the same RTX 4090 machine as the Unreal Editor. This latency will be reduced when you run Audio2Face in headless mode on a separate GPU. In this video the audio is delayed by 367 ms so that it stays in sync with the lip-sync processing.
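To illustrate the audio delay idea, here is a minimal standalone C++ sketch of a fixed delay line. It is not the plugin's code and uses no Unreal or Audio2Face APIs: the class name `AudioDelayLine` is hypothetical, and I'm assuming mono 16-bit PCM at 24 kHz, where a 367 ms delay works out to 8808 samples.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical helper (not part of any plugin): delays a mono PCM16 stream
// by a fixed number of milliseconds so audio can line up with lip-sync output.
class AudioDelayLine {
public:
    AudioDelayLine(int sampleRateHz, int delayMs)
        // Pre-fill the buffer with silence equal to the requested delay.
        : pending_(toSamples(sampleRateHz, delayMs), 0) {}

    // Push new samples in and get the same number of delayed samples out.
    std::vector<int16_t> process(const std::vector<int16_t>& input) {
        std::vector<int16_t> output;
        output.reserve(input.size());
        for (int16_t sample : input) {
            pending_.push_back(sample);          // newest sample goes to the back
            output.push_back(pending_.front());  // oldest sample is played now
            pending_.pop_front();
        }
        return output;
    }

    std::size_t delaySamples() const { return pending_.size(); }

private:
    static std::size_t toSamples(int sampleRateHz, int delayMs) {
        assert(sampleRateHz > 0 && delayMs >= 0);
        return static_cast<std::size_t>(
            static_cast<long long>(sampleRateHz) * delayMs / 1000);
    }

    std::deque<int16_t> pending_;
};
```

You would construct it once, for example `AudioDelayLine delay(24000, 367);`, and pass every incoming audio chunk through `delay.process(chunk)` before playback. The delay value itself has to be measured for your own setup.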

Because we are using the OpenAI Realtime Voice API, there is no text to convert; it is voice to voice.

No tutorial, unfortunately. I created the Realtime Voice API part of the Unreal OpenAI plugin in my free time for a demo. The Blueprint above shows how I used the Runtime Audio Importer plugin to play the Realtime Voice output as Unreal sound waves that can be interrupted by the server-side VAD.
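For anyone who prefers code to Blueprints, here is a rough standalone C++ sketch of the interruption logic only. It is not the plugin's implementation and does not use the Runtime Audio Importer API. The class `InterruptiblePlayback` is hypothetical, and I'm assuming the event names `response.audio.delta` and `input_audio_buffer.speech_started` from the Realtime API beta, which may differ in the API version you target. The audio payload is assumed to be already decoded from base64 into PCM16 samples.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical playback queue: buffers decoded audio chunks and drops them
// when the server-side VAD reports that the user has started speaking.
class InterruptiblePlayback {
public:
    // Returns true if this event interrupted (discarded) buffered audio.
    bool onServerEvent(const std::string& type, std::vector<int16_t> pcm = {}) {
        if (type == "response.audio.delta") {          // assumed event name
            queue_.push_back(std::move(pcm));
            return false;
        }
        if (type == "input_audio_buffer.speech_started") {  // assumed event name
            const bool hadAudio = !queue_.empty();
            queue_.clear();  // stop talking: throw away everything not yet played
            return hadAudio;
        }
        return false;  // other event types are ignored in this sketch
    }

    // Called by the audio thread: fetch the next chunk to play, if any.
    bool popChunk(std::vector<int16_t>& out) {
        if (queue_.empty()) return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

    std::size_t queuedChunks() const { return queue_.size(); }

private:
    std::deque<std::vector<int16_t>> queue_;
};
```

In a real Unreal setup you would also stop the sound wave that is currently playing, not just clear the queue, and you would need locking if the network and audio threads both touch the queue.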
