How to implement real-time lip sync for an avatar chatbot powered by GPT

ritanshu.eminence · November 29, 2023, 4:00pm

I want to create a human-like chatbot that listens to users' questions
and answers them, with the avatar's lips synced to the spoken answer,
so that users can have a real-time conversational voice interaction with the avatar.
I will be using GPT for the natural-language conversation.

But I have not found a way to implement real-time lip sync for the avatar. What tools or models do I need to use to achieve this?

fra_ab · November 29, 2023, 6:31pm

Look into HeyGen: Quick Start

anon5861895 · November 29, 2023, 10:30pm

Interesting subject. You’ll be surprised that there are not many TTS solutions that support out-of-the-box lip sync.

Azure and Amazon TTS produce built-in lip-sync data, called visemes. Look them up.
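
To make the viseme idea concrete, here is a minimal Python sketch, not anything taken from the Azure or Amazon SDKs. It assumes the viseme events have already been saved as JSON lines with `time` (milliseconds), `type`, and `value` fields, which is the shape Amazon Polly uses for its speech marks. The sample data and the function names `parse_viseme_marks` and `viseme_at` are made up for illustration. The avatar's render loop would call `viseme_at` with the current audio playback position to pick a mouth shape.

```python
import bisect
import json

# Hypothetical sample data, in the JSON-lines shape assumed above.
SAMPLE_MARKS = """\
{"time": 0, "type": "viseme", "value": "p"}
{"time": 120, "type": "viseme", "value": "E"}
{"time": 260, "type": "viseme", "value": "t"}
{"time": 400, "type": "viseme", "value": "sil"}
"""


def parse_viseme_marks(text):
    """Return a time-sorted list of (time_ms, viseme) tuples."""
    marks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        mark = json.loads(line)
        # Ignore other mark types (e.g. word or sentence marks).
        if mark.get("type") == "viseme":
            marks.append((mark["time"], mark["value"]))
    marks.sort()
    return marks


def viseme_at(marks, playback_ms, default="sil"):
    """Return the viseme that is active at the given playback time."""
    times = [t for t, _ in marks]
    # Index of the last mark that starts at or before playback_ms.
    i = bisect.bisect_right(times, playback_ms) - 1
    return marks[i][1] if i >= 0 else default


if __name__ == "__main__":
    marks = parse_viseme_marks(SAMPLE_MARKS)
    for ms in (0, 130, 300, 500):
        print(ms, viseme_at(marks, ms))
```

Azure reports viseme timing differently (numeric viseme IDs with an audio offset per event), so the parsing step would need adapting there; the lookup by playback time stays the same.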

You’ll find a lot of AI video solutions, but none of them are suitable for real-time animations.

mendicot · December 2, 2023, 7:10pm

There is a very comprehensive Facebook group on this topic at Virtual Beings, just Google it.

jcchi · February 5, 2024, 9:32am

Hey, we did a hackathon project a while back using the RealTime lip-sync solution.
The neat part is that it can be integrated as a React component, with no complicated game engine setup:
“github dot com /BennyKok/leaked-zoom”

You can ask any questions in their Discord: “discord dot gg / ZXKaZq4gMR”

boozybob · March 17, 2024, 5:06am

Hi,

Did you make any good progress on this? We want to do something similar. Hoping you could help point us in the right direction.

stamatis.kourtis · March 26, 2024, 4:48pm

I am also interested in this, so please let me know if you come up with a solution.

g.dinino · March 29, 2024, 8:35am

Interested as well. So far I found these solutions:

HeyGen
D-ID
Alibaba Cloud (but I have not been able to get in touch with them)

dougc · April 1, 2024, 12:36pm

Character API by Media Semantics (available on AWS Marketplace) offers real-time animation with lip-sync.

seofai · April 1, 2024, 10:30pm

HeyGen
Character API by Media Semantics
Rhubarb Lip Sync
Wav2Lip and its Extensions
Vidnoz AI
RealTime Lip-Sync Solution

This is what I found on the internet, but implementing any of them might take a bit of googling.

OpinionAI · May 13, 2024, 2:53pm

Check out app.aivah.ai by OpinionAI. Connect at www.opinionai.ai

eddiegainz · June 29, 2024, 8:48pm

Are there any more real-time lip-syncing providers like D-ID and HeyGen?

luisdemiguel · August 1, 2024, 1:38pm

We created a solution for this, but it is still not perfect:

https://mextrump-arxitjfija-no.a.run.app/

nasseralbess · August 20, 2024, 7:39am

Hello, great product. Is this open-source?

saeedusmani320 · September 8, 2024, 8:52pm

Kindly provide the source code to create a chatbot with a face.

alilaptop393 · October 19, 2024, 1:27pm

Hi, I’m working on a similar project but with a slight difference: mine involves generating video from text. The process includes taking a text input, generating audio from it using Text-to-Speech (TTS), and then using that audio along with a 3D lifelike avatar to represent or speak the text. So far, I’ve successfully implemented TTS using Coqui TTS, which is amazing for generating natural-sounding audio. However, I’m having trouble syncing the audio with the avatar’s lip movements naturally and accurately. I tried using Wav2Lip, but since I don’t have a GPU, the performance is poor. Could you please guide me on how to improve this?

markrmiller · October 21, 2024, 4:26am

You would have to know some Python, but I’ve done this with NVIDIA’s Audio2Face. They have a sample extension that takes text, sends it to one of their TTS systems (Riva or something) and streams the returned audio into Audio2Face, which animates the face of a model. I just changed it to take the text you input, send it to ChatGPT or a local model, send that result to a local XTTS2 model (you can use whatever TTS you like, preferably one that can stream), and then stream that result into Audio2Face. I made a 3D model of a guy I know, cloned his voice with XTTS2, and you could chat with him and he would speak to you.
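
The flow described above (text in, LLM reply, streaming TTS, audio pushed to the face animator) can be sketched roughly like this in Python. This is not the Audio2Face sample extension's code: `generate_reply`, `synthesize`, and `send_audio` are hypothetical callables standing in for the chat model, the TTS engine, and whatever streams audio to the animation tool, and the stubs at the bottom only fake them. Splitting the reply into sentences is an added assumption, so that speech can start before the whole answer has been synthesized.

```python
import re
from typing import Callable, Iterator, List


def split_sentences(text: str) -> List[str]:
    """Split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def run_pipeline(
    user_text: str,
    generate_reply: Callable[[str], str],
    synthesize: Callable[[str], Iterator[bytes]],
    send_audio: Callable[[bytes], None],
) -> int:
    """Send the reply's audio chunk by chunk; return the chunk count."""
    reply = generate_reply(user_text)
    sent = 0
    for sentence in split_sentences(reply):
        # Forward each chunk as soon as the TTS yields it.
        for chunk in synthesize(sentence):
            send_audio(chunk)
            sent += 1
    return sent


# --- Stand-in stubs (hypothetical, for demonstration only) ---

def fake_llm(prompt: str) -> str:
    return "Hello there. How are you?"


def fake_tts(sentence: str, chunk_size: int = 8) -> Iterator[bytes]:
    # Pretend the UTF-8 bytes of the text are audio samples.
    data = sentence.encode("utf-8")
    for i in range(0, len(data), chunk_size):
        yield data[i:i + chunk_size]


if __name__ == "__main__":
    received: List[bytes] = []
    count = run_pipeline("hi", fake_llm, fake_tts, received.append)
    print(count, "chunks sent")
```

In a real setup each stub would be replaced by a call into the actual LLM, TTS, and animation tool, and `send_audio` would typically also need the sample rate and format that the receiving side expects.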

alex.tai.kh · October 23, 2024, 11:18am

Did you find any solution? I am interested in that too.

rafiul.nakib · October 24, 2024, 5:27pm

Hi, what you have done sounds interesting. Any chance you could share the source code so that I can learn? I am new to this and trying to get my head around. Thanks in advance!

alijunaid882 · November 18, 2024, 5:30am

Hi, we are working on a similar project and have had significant success developing a model that gives real-time responses with lip sync.

If anyone is interested in collaborating or in using this solution, please let us know.
