How to implement real-time lip sync for an avatar chatbot powered by GPT

I want to create a human-like avatar chatbot that listens to users' questions
and answers them, with the avatar's lips synced to the spoken answer.
In other words, a real-time conversational voice avatar that users can interact with.
I will be using GPT for the natural language conversation.

But I can't find a way to implement real-time lip sync for the avatar. What tools or models do I need to achieve this?

8 Likes

Look into HeyGen: Quick Start

1 Like

Interesting subject. You may be surprised that not many TTS solutions support lip sync out of the box.

Azure TTS and Amazon Polly can return lip-sync data along with the audio. The units are called visemes. Look them up.

You’ll find a lot of AI video solutions, but none of them are suitable for real-time animations.
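To make the viseme idea concrete, here is a minimal sketch of how a client could pick the mouth shape for the current audio position. It assumes the viseme timeline arrives as line-delimited JSON with `time` (milliseconds), `type`, and `value` fields, which is the shape Amazon Polly uses for speech marks. The sample data and the function names `parse_viseme_marks` and `viseme_at` are made up for illustration, and fetching the marks from a TTS service is left out.

```python
import bisect
import json
from typing import List, Optional, Tuple

# Made-up sample timeline in a line-delimited JSON layout (time in ms).
# Non-viseme marks (for example word marks) can be mixed in and are skipped.
SAMPLE_MARKS = """\
{"time": 0, "type": "viseme", "value": "sil"}
{"time": 120, "type": "word", "start": 0, "end": 3, "value": "pat"}
{"time": 120, "type": "viseme", "value": "p"}
{"time": 260, "type": "viseme", "value": "a"}
{"time": 480, "type": "viseme", "value": "t"}
{"time": 600, "type": "viseme", "value": "sil"}
"""


def parse_viseme_marks(text: str) -> List[Tuple[int, str]]:
    """Return (time_ms, viseme) pairs sorted by time, ignoring other mark types."""
    marks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        mark = json.loads(line)
        if mark.get("type") == "viseme":
            marks.append((int(mark["time"]), mark["value"]))
    marks.sort(key=lambda m: m[0])
    return marks


def viseme_at(marks: List[Tuple[int, str]], playback_ms: int) -> Optional[str]:
    """Return the viseme active at playback_ms, or None before the first mark."""
    times = [t for t, _ in marks]
    # Index of the last mark whose start time is <= playback_ms.
    index = bisect.bisect_right(times, playback_ms) - 1
    return marks[index][1] if index >= 0 else None


if __name__ == "__main__":
    timeline = parse_viseme_marks(SAMPLE_MARKS)
    for ms in (0, 150, 300, 500, 700):
        print(ms, viseme_at(timeline, ms))
```

In a real avatar you would call something like `viseme_at` on every animation frame with the audio player's current position, then blend the avatar's mouth toward the shape mapped to that viseme.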

2 Likes

There is a very comprehensive Facebook group on this topic at Virtual Beings, just Google it.

1 Like

Hey, we did a hackathon project a while back using a real-time lip-sync solution.
The neat part is that it can be integrated as a React component, with no complicated game engine setup:
“github dot com /BennyKok/leaked-zoom”

You can ask any questions in their Discord: “discord dot gg / ZXKaZq4gMR”

5 Likes

Hi,

Did you make any good progress on this? We want to do something similar. Hoping you could help point us in the right direction.

1 Like

I am also interested in this, so please let me know if you come up with a solution.

Interested as well. So far I found these solutions:

  • HeyGen
  • D-ID
  • Alibaba Cloud (but I’m not able to get in touch with them)
1 Like

Character API by Media Semantics (available on AWS Marketplace) offers real-time animation with lip-sync.

1 Like
  1. HeyGen
  2. Character API by Media Semantics
  3. Rhubarb Lip Sync
  4. Wav2Lip and its Extensions
  5. Vidnoz AI
  6. RealTime Lip-Sync Solution

This is what I found on the internet, but implementing any of them will probably take a bit of googling.

3 Likes

Check out app.aivah.ai by OpinionAI. Connect at www.opinionai.ai

1 Like

Are there any more real-time lip-sync providers like D-ID and HeyGen?

We built a solution for this, but it still isn’t perfect:

https://mextrump-arxitjfija-no.a.run.app/

1 Like

Hello, great product. Is this open-source?

Could you kindly share the source code for creating a chatbot with a face?

Hi, I’m working on a similar project but with a slight difference: mine involves generating video from text. The process includes taking a text input, generating audio from it using Text-to-Speech (TTS), and then using that audio along with a 3D lifelike avatar to represent or speak the text. So far, I’ve successfully implemented TTS using Coqui TTS, which is amazing for generating natural-sounding audio. However, I’m having trouble syncing the audio with the avatar’s lip movements naturally and accurately. I tried using Wav2Lip, but since I don’t have a GPU, the performance is poor. Could you please guide me on how to improve this?

1 Like

You would have to know some Python, but I’ve done this with Nvidia’s Audio2Face. They have a sample extension that takes text, sends it to one of their TTS systems (Riva or something), and streams the returned audio into Audio2Face, which animates the face of a model. I just changed it to take the text you input, send it to ChatGPT or a local model, send that result to a local XTTS2 model (you can use whatever TTS you like, preferably one that can stream), and then stream that result into Audio2Face. I made a 3D model of a guy I know, cloned his voice with XTTS2, and you could chat with him and he would speak to you.
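For anyone trying to picture the flow described above (text in, LLM reply, streaming TTS, audio pushed to the face animator), here is a rough Python sketch of the control flow only. It is not the Audio2Face sample extension and it calls no real API: `generate_reply`, `synthesize`, and `push_audio` are hypothetical callables you would back with your own LLM client, TTS engine, and animation sink. Splitting the reply into sentences is my own assumption, added so that audio can start playing before the whole reply has been synthesized.

```python
import re
from typing import Callable, Iterable, Iterator, List


def split_sentences(text: str) -> Iterator[str]:
    """Yield sentences, splitting on whitespace that follows '.', '!' or '?'."""
    for part in re.split(r"(?<=[.!?])\s+", text.strip()):
        if part:
            yield part


def run_turn(
    user_text: str,
    generate_reply: Callable[[str], str],          # hypothetical LLM wrapper
    synthesize: Callable[[str], Iterable[bytes]],  # hypothetical streaming TTS
    push_audio: Callable[[bytes], None],           # hypothetical animation sink
) -> int:
    """Run one chat turn and return the number of audio chunks forwarded."""
    reply = generate_reply(user_text)
    chunks_sent = 0
    for sentence in split_sentences(reply):
        # Forward each chunk as soon as the TTS yields it, so the face
        # can start moving while later sentences are still being synthesized.
        for chunk in synthesize(sentence):
            push_audio(chunk)
            chunks_sent += 1
    return chunks_sent


# Stand-ins so the sketch runs without any external service.
def fake_reply(prompt: str) -> str:
    return "Hello there. I can hear you! How can I help?"


def fake_tts(sentence: str) -> Iterator[bytes]:
    data = sentence.encode("utf-8")  # pretend the text bytes are audio
    for i in range(0, len(data), 8):
        yield data[i:i + 8]


if __name__ == "__main__":
    received: List[bytes] = []
    count = run_turn("hi", fake_reply, fake_tts, received.append)
    print(count, "chunks forwarded")
```

To use it for real, swap the three stand-ins for your own functions. The loop itself does not care which LLM, TTS, or animation tool sits behind them.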

1 Like

Did you find a solution? I’m interested in this too.

1 Like

Hi, what you have done sounds interesting. Any chance you could share the source code so that I can learn from it? I am new to this and trying to get my head around it. Thanks in advance!

Hi, we are working on a similar project and have had significant success developing a model that gives real-time responses with lip sync.

If anyone is interested in collaborating or in using this solution, please let us know.