Here is a Realtime Voice API Plugin for Unreal Engine and all-talking 3D Metahumans

I only got the open-source Runtime Audio Importer running on Linux using 5.4, but with no working audio capture. However, I got the 5.3 plugin from the Unreal Marketplace (now Fab, I guess) working on Windows. Here is the link, and there is an excellent Discord support channel: GitHub - gtreshchev/RuntimeAudioImporter (Runtime Audio Importer plugin for Unreal Engine; imports audio of various formats at runtime).


Hi, I can't seem to find the Append Audio Data From Raw node. I already bought and enabled the Runtime Audio Importer plugin.

Not sure why the Append Audio Data From Raw node does not appear; maybe it's hidden in a right-click context menu. I'd consult the plugin's Discord group.

Hi Robert, can I set this up on the web, or export all of this to the web so that I can access it online?

Hi @robertb,
Thank you so much.
I already set everything up, but the character just repeats whatever I say. Maybe it's not processing the audio correctly; it just repeats the prompt.

Any help with this?
@samirertt, did you get everything to work?
I need a little help with this, please.

If the character repeats what you say, something may not be set up correctly in the Blueprint; missing one detail from what is presented in the thread above could cause something like this in how the audio is routed. Note that the OpenAI Realtime Plugin captures the audio and sends it to the OpenAI API directly, so that aspect is not dependent on the Blueprint routing.


Here's my Blueprint; I can't seem to find anything missing.

Someone from the Runtime Audio Importer Discord said this: "If you're capturing the audio data within that capturable sound wave, you should call the ReleaseMemory function after the capture is finished, so that the appended audio data is the only audio data inside the capturable sound wave, without the previously recorded speech."

I'm yet to make sense of it, but here is my best reading as a rough C++ sketch below.
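Only `ReleaseMemory` and the capturable sound wave come from the quote; the callback around them is hypothetical, so check the RuntimeAudioImporter docs for the exact signatures.

```cpp
// Hypothetical callback fired once a capture pass has finished.
// UCapturableSoundWave and ReleaseMemory are the names quoted above.
void OnCaptureFinished(UCapturableSoundWave* CapturableWave)
{
    // Drop the previously recorded speech so that the data appended next
    // (via Append Audio Data From Raw) is the only audio the sound wave holds.
    CapturableWave->ReleaseMemory();
}
```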

I don't see anything wrong in the Blueprint at a quick glance.

I'd advise going through this thread one more time, step by step, to compare against what you have done in your setup. This is all quite complex and experimental, but the long thread above does provide enough information to replicate the experiment, as proven by at least one person. For me this was an experiment; I'm working on something else at the moment, but I will update this thread if I get back to this project and will share any updates.


@robertb Hey man, any update on Android audio capture? I am trying to get this working on a Meta Quest 3.

@muqeeta2 The realtime part of the OpenAI plugin uses the standard built-in Audio Capture interface from Unreal Engine, so it should work anywhere audio capture works in Unreal in general. I don't know much beyond that; this was my first test project with the engine, so I'm not sure about Android support.
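For reference, a minimal sketch of the engine-side capture this relies on (not the plugin's actual code; `StartMicCapture` is made up for illustration):

```cpp
// UAudioCaptureComponent, from the engine's built-in AudioCapture module,
// records from the default input device, so platform support follows
// whatever input devices Unreal exposes on that platform.
#include "AudioCaptureComponent.h"

void StartMicCapture(AActor* Owner)
{
    UAudioCaptureComponent* Capture =
        NewObject<UAudioCaptureComponent>(Owner, TEXT("MicCapture"));
    Capture->RegisterComponent(); // attach to the running world
    Capture->Start();             // begin pulling audio from the default mic
}
```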

This is fantastic! I don't have much experience with Unreal yet, but do you know if it is possible (as in, not totally crazy) to run this type of workload on a server? We are building a virtual assistant, and of course you want to have more than one conversation at a time.

@leonid.sokolov Yes, it's possible: Audio2Face can run on the server in headless mode, and Unreal can run on the server as well via Unreal Pixel Streaming.
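On the Unreal side, launching a packaged Linux build for Pixel Streaming looks roughly like this (UE 5.x flag names, with a placeholder signalling-server address; check the Pixel Streaming docs for your engine version):

```
# Run headless and connect to the Pixel Streaming signalling server.
./MyProject.sh -RenderOffscreen -PixelStreamingURL=ws://127.0.0.1:8888
```

Audio2Face's headless mode is driven over its gRPC interface, as mentioned later in this thread; see NVIDIA's docs for that side.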


Hi! I followed the steps on the GitHub page, but I keep getting the error message “failed to get ‘text’ field from json response”.


It seems that I have set up the nodes as in the example, but I am unable to get a conversation going: just one reply and then nothing after that. Any idea why that could be?

No ideas on that, unfortunately.

Try looking at the Unreal Output Log for the level to see if there is an error message there.
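For what it's worth, that message is the typical failure mode of a JSON field lookup; a sketch of the pattern (not the plugin's actual code; `ResponseJson` is assumed to be a parsed `TSharedPtr<FJsonObject>`):

```cpp
// Sketch of the usual pattern behind an error like that (illustrative only).
FString Text;
if (!ResponseJson->TryGetStringField(TEXT("text"), Text))
{
    // The response was probably an API error object (bad key, wrong model,
    // quota) rather than the expected payload -- log the raw body to see.
    UE_LOG(LogTemp, Error, TEXT("failed to get 'text' field from json response"));
}
```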

I've used the gRPC stream and headless mode to do this, along with the Python code.

YouTube: https://youtu.be/8FWriraLjGQ

I do everything in Python and just use Live Link for the MetaHuman and a WebSocket to communicate with UE.


That looks really good! 🙂

Hey Robert, I've started to edit your realtime.cpp to make it better. So far I've added a Blueprintable tag to your UCLASS, which makes it usable as a variable in Blueprints, and I've started adding the new OpenAI available voices. I just wanted to contact you directly because I need this part of the plugin, and it could work better but needs to be customized a little, so I want to know your thoughts. Thanks.
@robertb
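For anyone curious, a sketch of the kind of change I mean (declarations here are illustrative, not the plugin's actual ones; the voice list is what the Realtime API offers at the time of writing and may change):

```cpp
#include "UObject/Object.h"
#include "OpenAIRealtime.generated.h" // assumes a header with this name

// Voices currently offered by the OpenAI Realtime API (subject to change).
UENUM(BlueprintType)
enum class EOpenAIRealtimeVoice : uint8
{
    Alloy, Ash, Ballad, Coral, Echo, Sage, Shimmer, Verse
};

// Blueprintable allows Blueprint subclasses; BlueprintType allows the class
// to be used as a variable type in Blueprint graphs.
UCLASS(Blueprintable, BlueprintType)
class UOpenAIRealtime : public UObject
{
    GENERATED_BODY()

public:
    // Voice to request for the realtime session, now editable from Blueprint.
    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "OpenAI")
    EOpenAIRealtimeVoice Voice = EOpenAIRealtimeVoice::Alloy;
};
```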

And I have another question that I cannot answer from your code:

  1. How can we close the session? (Manually, on a timer, or how will it be closed?)
  2. How can I tell when the audio has been completely received? I ask because I'm using manual animation and fake lip sync.
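From the Realtime API docs (not specific to this plugin), my understanding so far: the session has no explicit close message and ends when the client closes the WebSocket (e.g. `WebSocket->Close()` on Unreal's `IWebSocket`), and the server signals the end of audio with a `response.audio.done` event. A minimal sketch, assuming the plugin hands each server event to us as a parsed `FJsonObject`:

```cpp
#include "Dom/JsonObject.h"

// Sketch only: the event names are from the OpenAI Realtime API reference;
// how events reach this function is an assumption about the plugin.
void HandleRealtimeServerEvent(const TSharedPtr<FJsonObject>& Event)
{
    FString Type;
    if (!Event->TryGetStringField(TEXT("type"), Type))
    {
        return;
    }

    if (Type == TEXT("response.audio.done"))
    {
        // All audio deltas for this response have arrived -- a usable cue
        // for stopping manual animation / fake lip sync.
    }
    else if (Type == TEXT("response.done"))
    {
        // The entire response (audio and text) is complete.
    }
}
```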