Here is a Realtime Voice API Plugin for Unreal Engine and all-talking 3D Metahumans

robertb · October 18, 2024, 5:01pm

For fun, here is an Unreal OpenAI Plugin I’ve extended to include support for the OpenAI Realtime Voice API:

And here is rough phone recording of my first test of a realtime 3D Metahuman running in Unreal Engine with Audio2Face:

Face and head idle animations are not working yet and as I’m running Audio2Face on the same, 4090 RTX computer, I have to delay the audio by 367ms to match the generated lip-sync. Running the Audio2Face in headless mode on a separate computer should take that lip-sync delay down to 100ms.

robertb · October 19, 2024, 1:13am

I’ve done a second test with our Metahuman friend Ragnar, this time recorded as a 4K screencast with Unreal Engine in the highest “Cinematic” quality setting. Google Drive only shows 1080p preview, you’ll need to download the video for the 4K version: OpenAI_Realtime_API_Ragnar_Unreal_SeconTest_4K.mp4 - Google Drive

PaulBellow · October 19, 2024, 5:25am

Welcome to the community!

I’ve added a project tag for you. We ask that you keep your updates in this thread, so it’s easier for everyone to keep up to date.

I did some playing around with UE4 several years back, but nothing recently. Can you share with the community some of the struggles you had with the integration?

Thanks, and again, welcome! We have a few gamedev and gaming topics, but likely many more not correctly tagged… yet.

Hope you enjoy your stay!

robertb · October 19, 2024, 10:58am

Thanks, Paul. I found an open-source Unreal plugin for OpenAI that I extended, but to be honest, o1-preview did most of the heavy lifting for the Realtime API Plugin code. I fed the C++ headers and source code from the OpenAI Unreal plugin, along with the OpenAI Realtime Tutorial pages, into o1-preview, and it generated the Realtime part of the plugin in pretty good shape!

The next big challenge was integrating the plugin with audio capture, streaming, and turn detection within Unreal Blueprints. I used another excellent open-source Unreal plugin called Runtime Audio Importer and got great support from the author. Here’s the blueprint:

The biggest challenge was getting Audio2Face to work. I had to limit Unreal Engine to 25fps to keep everything running smoothly on a single RTX 4090. Getting the audio into Audio2Face also turned into a bit of a hack. A2F supports mic input or a gRPC stream, and while I plan to implement gRPC soon for running A2F headless, for this first demo I used a VB-CABLE Virtual Audio Device to patch the main audio output into a virtual cable, treating it like a mic input for A2F. The actual audio output was on a separate channel before hitting the speakers, which allowed me to add the necessary 367ms delay for A2F lip-sync, which I hope to reduce to 100ms when running A2F in headless mode on a separate 4090 RTX along with Audio2Gesture.

PaulBellow · October 21, 2024, 12:00am

Very cool to hear!

LLMs in general have gotten a lot better at coding. I saw something recently about o1 being best at strategy and coding tasks mostly.

Nice. Thanks for sharing. Hopefully, if someone stumbles across this thread, they might be helped a bit.

Please keep us updated!

syedhannaan97 · October 29, 2024, 9:11am

so how does the framework work and how did you push the audios to audi02face to get the realtime lipsync from the openai plugin ,can you drop a tutorial or just awritten framework on how the setup works

syedhannaan97 · October 29, 2024, 9:13am

is the latency as low as shown in the video or is it edited because from what i understand the frameworks has to be something like -audioinput converted to text and pushed to the api and response text is then converted to audio and pushed to audio2face where it is connected with live link and audio is played from the a2f instead

robertb · October 29, 2024, 10:42am

The video is not edited, and there is even extra 150ms-200ms latency as I’m running the UI version of audio2face on the same 4090 RTX computer as Unreal Editor - this latency will be reduced when you run audio2face in headless mode on a separate GPU. In this video the audio is delayed by 367ms so it is in sync with the lip-sync processing.

As we are using the OpenAI Realtime Voice API there is no text to convert, it’s voice to voice.

No tutorial unfortunately, I created the Realtime Voice API part of the Unreal OpenAI plugin in my free time for a demo. If you look at the Blueprint above then that will show you how I used the Runtime Audio Importer plugin to get the Realtime Voice playing as Unreal Soundwaves that can be interrupted from the server side VAD.

samirertt · November 18, 2024, 9:14pm

hello robert, i have been working on this project and the product you have got is amazing. Can you please share all the blueprint parts of your project so i could be able to work on it. Also can you share some ways for audio2face. am really new to unreal engine and these kind of tech.

samirertt · November 19, 2024, 8:33am

okey so robert i have understood what audio2face is and was successfully able to use it but i still dont get one thing that is how do you make the metahuman use the audio received from the gpt at runtime. thanks great work

robertb · November 19, 2024, 12:55pm

To setup the whole process on a standalone RTX4090 Win box here are some bullets:

Install the RuntimeAudioImporter Plugin to Unreal, from the Marketplace (or you can build the open source version)
Add my fork of the OpenAI Plugin to the Plugins folder of your Unreal project, then unreal will build it from the sources when you open the project again(works both on Linux and Windows, but audio capture is not working on Linux as it’s not yet supported at all on Linux).
Use the Blueprint earlier in this thread to set everything up, you’ll need to add your OpenAI API key to that Blueprint item
Install Audio2Face
Here is, frankly, a hack to have everything running on the same Win computer, which I needed for my demo, patching the Unreal Audio Out to a Virtual Cable to Audio2Face Mic In (the proper way is to use gRPC audio streaming into A2F)
Install V Cable and Voicemeeter from here: VB-Audio Virtual Apps
Setup V Cable to route output audio to Cable that can be used as Mic input for A2F
In Windows Sounds settings use Virtual Output Cable as Mic In
In Audio2Face
Choose a Ark Rigged face, there are two options, male and female
Choose Streaming mode and start recording, it will be silence to start with
Start streaming the animation, one of the options for A2F, you can also choose idle head animations but have not got those working fully
In Windows Sounds settings now use the real Mic as for the Mic input
Start Voicemeeter, this way you can again patch the out from A2F to the speakers, and set the output delay to around 300ms to accommodate the A2F processing time (if you run A2F on a separate computer in headless mode this delay should be <100ms)
In Unreal
Use this tutorial, or others, to setup LiveLink: https://www.youtube.com/watch?v=6QV3U5PENPM
Add the Realtime Blueprint (from above to your scene)
Press Play

andrei.zenith · November 21, 2024, 6:28am

Hi!Does this work on Mac?And if so ,where can I find the open source version for the RuntimeAudioImporter?

robertb · November 21, 2024, 11:22am

I’m not sure if this all will work on Mac, I haven’t tested it and never run Unreal on Mac , but here is the Runtime Audio Importer and it says it works on Mac: GitHub - gtreshchev/RuntimeAudioImporter: Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.

samirertt · November 21, 2024, 9:24pm

hello robert, i have got the idea between the unreal engine and the audio2face but i have got a problem when i clone your repo it says the versions of the engine are not compatible can you specify which version does the plugin works. which version engine are you using while running the plugin. also i didnt understand the blueprint part what before the open AICALL Realtime other then the api key. so am i missing anything for the before open AICall Realtime. Thanks alot for your helps robert appreciate it.

samirertt · November 21, 2024, 10:08pm

also i couldnt find the settings on unreal engine to make the audio out to the virtual cable. i searched for settings on audio in the engine settings but i wasnt able to find it.

robertb · November 21, 2024, 11:40pm

Regarding Unreal Engine Version: I got everything working in Unreal Engine 5.4.4 using the pre-built Runtime Audio Importer from the Marketplace. However, to get Audio2Face functioning, I had to downgrade to Unreal 5.3 since Audio2Face currently doesn’t support Unreal 5.4 or 5.5.

Regarding Virtual Cables: There’s nothing specific to configure in Unreal for virtual cables. Follow these steps:

Open the Sound settings in Windows and set the system’s audio output to a virtual input cable.
Connect the virtual output cable as the microphone input in the same settings.
Start Audio2Face in Record or Live mode with this configuration.
Before pressing play in Unreal, go back to the Sound settings and switch the input back to the real microphone instead of the virtual cable. (Yes, it’s a bit of a hack, but it works!)
On the other end, use vMeeter to route the output back to the sound card. In vMeeter settings, you can adjust the output delay to compensate for the 300ms processing time required by Audio2Face.

samirertt · November 22, 2024, 10:07pm

hello robert, my issue on the plugin continues. it is not able to combile the plugin. i have tried it with 3 different versions of engine 5.2 5.3 5.4 none of them is able to combile. but when i get the plugin from the drive shared at the github page of yours there is no problem but the updates that you made are not present. if you could help me out to compile the plugin the whole project would finish thanks alot for your helps Robert.

robertb · November 22, 2024, 10:21pm

Happy to help!

You need my fork here: GitHub - rbjarnason/OpenAI-Api-Unreal: Integration for the OpenAI Api in Unreal Engine

I only tested this on versions 5.3 and 5.4.

You’ll need Visual Studio installed for the build tools and C++ compiler. I used the free version.

You will also need the Runtime Audio Importer installed either from source or as I did from the Unreal plugin store.

If you already have all that installed check the build logs, under %USERPROFILE%\AppData\Roaming\Unreal Engine\AutomationTool\Logs\

samirertt · November 23, 2024, 1:02pm

hello again robert but i am not able to compile the plugin. Its giving alot of problem can you compile it on a 5.3 version engine and then zip the plugin folder and send it.

I have already purchased runtime audio importor from the marketplace only the plugin compatibility is the problem.

robertb · November 23, 2024, 1:32pm

Sure, I’ve uploaded my Unreal 5.3 Win64 binary build of the OpenAI plugin here on Google Drive: BinariesPlugins - Google Drive

Topic		Replies	Views
Realtime API extremely expensive Feedback realtime	66	4598	December 4, 2024
Python integration of real time? API	13	2604	October 5, 2024
[Realtime API] AI Answering Gibberish API realtime , api-realtime , api-realtime-speech	9	397	October 25, 2024
Connecting to the Realtime API API	44	5260	December 14, 2024
Problems using session.update with the realtime-api (issue with "input_audio_transcription") Bugs api-realtime , api-realtime-speech	10	1203	October 15, 2024

Here is a Realtime Voice API Plugin for Unreal Engine and all-talking 3D Metahumans

Related topics