Here is a Realtime Voice API Plugin for Unreal Engine and all-talking 3D Metahumans

  1. There is an API call for closing the Realtime session with OpenAI that I don't think I implemented.
  2. This is a bit tricky, as the voice data streams from the OpenAI audio servers to the client much faster than real time. I guess you could have some sort of Blueprint trigger that fires on the SoundWave when it has finished playing.
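
Because the audio arrives faster than it plays, one way to know when speech has finished is to estimate the playback length from the number of bytes received and compare it with the time elapsed since playback started. The following is only a sketch in plain C++ with no Unreal types: `PlaybackEstimator` and its members are hypothetical names, not part of the plugin, and the defaults assume 16-bit mono PCM at 24000 Hz.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the plugin): tracks how many seconds of
// audio have been received so a "finished speaking" event can fire on time.
struct PlaybackEstimator {
    int SampleRate = 24000;   // assumption: 24 kHz output audio
    int NumChannels = 1;      // assumption: mono
    int BytesPerSample = 2;   // assumption: 16-bit PCM
    std::size_t TotalBytes = 0;

    // Call for every decoded audio chunk that arrives from the server.
    void OnAudioChunk(std::size_t NumBytes) { TotalBytes += NumBytes; }

    // Total playback length, in seconds, of everything received so far.
    double TotalSeconds() const {
        assert(SampleRate > 0 && NumChannels > 0 && BytesPerSample > 0);
        const double BytesPerSecond =
            static_cast<double>(SampleRate) * NumChannels * BytesPerSample;
        return static_cast<double>(TotalBytes) / BytesPerSecond;
    }

    // True once the elapsed playback time covers all received audio.
    bool IsFinished(double ElapsedSeconds) const {
        return ElapsedSeconds >= TotalSeconds();
    }
};
```

In a game you would feed `IsFinished` the time since the SoundWave started playing and fire the Blueprint event the first time it returns true after the last chunk has arrived.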

Sounds good, feel free to open a PR against my fork of the OpenAI Unreal Plugin. I’ve also already made a pull request to the original creator’s root repo. I used o1-preview to program the Realtime additions to the plugin, and I’m sure o1 or o3-mini-high can help you quite a bit.


You might like this:
watch?v=uUDBwUx3feo


@robertb
Hey, why can't we use any other mic?
Why does this plugin only work with a phone?


The IAudioCapture interface chooses the first mic available, I think, but you can change the code here: OpenAI-Api-Unreal/Source/OpenAIAPI/Private/OpenAIAudioCapture.cpp at main · rbjarnason/OpenAI-Api-Unreal · GitHub


@samirertt @robertb
I have found a solution for the mic issue.
The problem is not with the default Windows mic or the input mic.
The main issue is that this code does not convert the audio to 1 channel at 44100 Hz.
GPT Realtime takes 1-channel audio at 24000 or 44100 Hz, but all the mics that I’ve tested record 2 channels at 44100 Hz.
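
The conversion described above can be sketched in standalone C++ as three steps: downmix to mono, resample, and convert to 16-bit PCM. This is not the plugin's code. The function names are hypothetical, and the sketch assumes the mic delivers interleaved float samples in [-1, 1] and that the target is 16-bit mono PCM at 24000 Hz. The resampler uses plain linear interpolation with no low-pass filter, so expect some aliasing; a proper resampler would sound better.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Average all interleaved channels of each frame into one mono sample.
std::vector<float> DownmixToMono(const std::vector<float>& Interleaved, int NumChannels) {
    assert(NumChannels > 0);
    const std::size_t Frames = Interleaved.size() / NumChannels;
    std::vector<float> Mono(Frames);
    for (std::size_t F = 0; F < Frames; ++F) {
        float Sum = 0.0f;
        for (int C = 0; C < NumChannels; ++C) {
            Sum += Interleaved[F * NumChannels + C];
        }
        Mono[F] = Sum / NumChannels;
    }
    return Mono;
}

// Linear-interpolation resampler (no anti-aliasing filter).
std::vector<float> ResampleLinear(const std::vector<float>& In, int SrcRate, int DstRate) {
    assert(SrcRate > 0 && DstRate > 0);
    if (In.empty()) return {};
    const std::size_t OutLen =
        static_cast<std::size_t>(static_cast<double>(In.size()) * DstRate / SrcRate);
    const double Step = static_cast<double>(SrcRate) / DstRate;
    std::vector<float> Out(OutLen);
    for (std::size_t I = 0; I < OutLen; ++I) {
        const double Pos = I * Step;
        const std::size_t I0 = std::min(static_cast<std::size_t>(Pos), In.size() - 1);
        const std::size_t I1 = std::min(I0 + 1, In.size() - 1);
        const float Frac = static_cast<float>(Pos - static_cast<double>(I0));
        Out[I] = In[I0] + (In[I1] - In[I0]) * Frac;
    }
    return Out;
}

// Clamp to [-1, 1] and scale to signed 16-bit PCM.
std::vector<std::int16_t> FloatToPcm16(const std::vector<float>& In) {
    std::vector<std::int16_t> Out(In.size());
    for (std::size_t I = 0; I < In.size(); ++I) {
        const float Clamped = std::max(-1.0f, std::min(1.0f, In[I]));
        Out[I] = static_cast<std::int16_t>(std::lround(Clamped * 32767.0f));
    }
    return Out;
}
```

For a 2-channel 44100 Hz capture, the chain would be `FloatToPcm16(ResampleLinear(DownmixToMono(Samples, 2), 44100, 24000))`, with the resulting bytes then encoded and sent to the API.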


Hello @robertb!
This plugin has been amazing so far. I have been working on implementing it all day and only need the Realtime API function for my uses. I’m on 5.3, using your plugin and the marketplace realtime audio, and my blueprint looks like this: blueprintue.com/blueprint/8cypxk1e/.

My project plays, and in the output log everything reads as successful up until this point:

LogTemp: No create response message provided, skipping response create event
LogTemp: ===> Event Type: session.created
LogTemp: ===> Event Type: error
LogTemp: Error: Error received: Invalid event: failed to parse JSON value. Please check the value to ensure it is valid JSON. (Common errors include trailing commas, missing closing brackets, missing quotation marks, etc.)
LogTemp: WebSocket Closed: StatusCode=1000, Reason=Peer did not specify a reason for initiating the closing, WasClean=1

Nothing ever crashes, but nothing ever happens either. I’ll speak into the mic (tried several different microphones with both USB and 3.5mm jack connections) with no successful audio capture. I’ve also got my API key loaded and money loaded on the OpenAI account. Do you have any ideas?

Thank you so much!


@nrekow2 Are there any clues further up in the log file? It looks like the connection to OpenAI is not working; is the API key being transferred and set up correctly? The JSON failure could be due to a null value coming in through the WebSockets.


Just as an update, and for anyone else who has the issue that I did: I traced my issue down to an apostrophe (’) in the instructions that was not being decoded correctly and was being sent as a \ instead, causing a JSON parsing error. Everything works now, and a big, big thank you to @robertb for this amazing plugin :)
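
For context on why a stray backslash breaks the request: in JSON, an apostrophe needs no escaping, and `\'` is not a valid escape sequence, so a parser rejects it. Only the double quote, the backslash, and control characters must be escaped inside a string. Below is a minimal standalone sketch of such escaping. `EscapeJsonString` is a hypothetical helper, not the plugin's code, and in a real Unreal project it is usually safer to build the payload with the engine's JSON utilities than by hand.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escape a UTF-8 string for use inside a JSON string
// literal. Apostrophes and non-ASCII bytes pass through unchanged.
std::string EscapeJsonString(const std::string& In) {
    std::string Out;
    Out.reserve(In.size() + 2);
    for (unsigned char Ch : In) {
        switch (Ch) {
            case '"':  Out += "\\\""; break;
            case '\\': Out += "\\\\"; break;
            case '\n': Out += "\\n";  break;
            case '\r': Out += "\\r";  break;
            case '\t': Out += "\\t";  break;
            default:
                if (Ch < 0x20) {
                    // Remaining control characters use the \u00XX form.
                    char Buf[7];
                    std::snprintf(Buf, sizeof(Buf), "\\u%04x", Ch);
                    Out += Buf;
                } else {
                    Out += static_cast<char>(Ch);
                }
        }
    }
    return Out;
}
```

With this, instructions such as `You're "Ragnar"` keep the apostrophe as-is and only the double quotes gain a backslash.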


Hi, is there a way to get your plugin working in UE 5.5? Is it possible to upgrade it?
Thank you

Hello, I’m having trouble getting the plugin to work on UE 5.4.

The plugin on the GitHub page didn’t work for me in either UE 5.3 or UE 5.4. I saw that the user [samirertt] also had this problem and asked for a compiled 5.3 version. [robertb] provided a Google Drive download for that version, so I tried it and it worked for me.

How can I get it to work in my 5.4 project?

Hi everyone

@robertb Your Ragnar Unreal project is fantastic. Congratulations, and thank you for sharing.

If I download the OpenAI API Plugin, is that all I need in order to create an interaction like the one you recorded? I would love to be able to do something along the same lines, but my coding skills are limited to Blueprints. Is that doable?

BTW: with the same plug-in, can I do actions based on voice commands?

I did this voice recognition with a very old plug-in I found, but it tends to fail a lot, so it's not very nice.
I can’t seem to post the YouTube video, but you get the idea, I think.

Thanks everyone.