Channeling OpenAI API output to a face with lip movements

Hello, OpenAI community!

I have been trying to build a talking face in my browser (using JavaScript) that can utter words as text comes back from the OpenAI API. Converting text to speech is easy with pre-built APIs, but I was not able to find any JavaScript library for driving the face movements. I went through some awesome research papers, like this and this, which do almost exactly what I want. However, they put more emphasis on generalizing to any human face and producing correct lip movements for the input text, which increases their computational cost. In my case I just want a 2D cartoon face whose lip movements match the text. Is it possible to do this with a lightweight model, something like MobileNet?
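For a 2D cartoon face you may not need a learned model at all: a common lightweight trick is to map text to a small set of mouth shapes ("visemes") and swap sprite frames as the speech plays. Here is a minimal sketch of that idea; the letter-to-viseme table is a deliberately crude assumption of mine (real systems map phonemes, not letters), and the function names are made up for illustration:

```javascript
// Crude grapheme-to-viseme mapping: each mouth shape corresponds to one
// sprite frame of the cartoon face. This table is an illustrative
// assumption, not a standard.
const VISEMES = {
  closed: ["b", "m", "p"], // lips pressed together
  round:  ["o", "u", "w"], // rounded lips
  wide:   ["e", "i"],      // spread lips
  teeth:  ["f", "v"],      // lower lip against teeth
};

// Pick a mouth shape for a single character, defaulting to a neutral
// open mouth for anything unmapped.
function visemeFor(ch) {
  const c = ch.toLowerCase();
  for (const [shape, letters] of Object.entries(VISEMES)) {
    if (letters.includes(c)) return shape;
  }
  return "open";
}

// Turn a word into the frame sequence a sprite-swap animation would play.
function visemesForWord(word) {
  return [...word].map(visemeFor);
}
```

In the browser you could drive this from the Web Speech API: speak the text with a `SpeechSynthesisUtterance` and, in its `boundary` event handler, call `visemesForWord` on the word being spoken and step through the returned frames on a short timer. That keeps everything client-side with no model inference at all.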


AFAIK, only triple-A game studios are experimenting with automated speech/mouth syncing. It’s the kind of thing that will stay a closely guarded secret for a while yet; mostly it still seems to be done as a post-production technique.

Anyway, this is the closest thing I could find specific to your query: A deep learning technique to generate real-time lip sync for live 2-D animation


I think Unreal Engine 5 is going to have something for lip sync… not sure about JavaScript, though…


Not JavaScript, but it seems like this company deals with exactly this issue:

(…though it is somewhat creepy; it seems right in uncanny-valley territory)

There is a useful discussion of visemes and phonemes here too:
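The viseme/phoneme distinction is the key to keeping this cheap: many phonemes share one mouth shape, so a handful of viseme frames covers all of speech. A small sketch of how a phoneme sequence collapses into an animation track; the phoneme groupings below are illustrative assumptions on my part, not a standard table:

```javascript
// Map a few ARPAbet-style phonemes to shared mouth shapes. Groupings
// here are illustrative, not a standard viseme set.
const PHONEME_TO_VISEME = {
  // bilabials: lips pressed together
  "P": "closed", "B": "closed", "M": "closed",
  // rounded vowels and glides
  "OW": "round", "UW": "round", "W": "round",
  // labiodentals: lower lip against upper teeth
  "F": "teeth", "V": "teeth",
  // open vowels
  "AA": "open", "AE": "open",
};

// Convert a phoneme sequence into the frames an animation would show,
// collapsing consecutive duplicates so the mouth sprite only swaps
// when the shape actually changes.
function visemeTrack(phonemes) {
  const track = [];
  for (const p of phonemes) {
    const v = PHONEME_TO_VISEME[p] ?? "rest";
    if (track[track.length - 1] !== v) track.push(v);
  }
  return track;
}
```

With a mapping like this, the per-frame work is a dictionary lookup, which is far below even MobileNet-class inference cost.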