Text-to-video generation using TTS for audio and a 3D avatar

alilaptop393 · October 19, 2024, 1:41pm

Hi, I’m working on a similar project but with a slight difference: mine involves generating video from text. The process includes taking a text input, generating audio from it using Text-to-Speech (TTS), and then using that audio along with a 3D lifelike avatar to represent or speak the text. So far, I’ve successfully implemented TTS using Coqui TTS, which is amazing for generating natural-sounding audio. However, I’m having trouble syncing the audio with the avatar’s lip movements naturally and accurately.
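
To make the sync problem concrete, here is a minimal sketch of the simplest approach I know of: drive a single "mouth open" value per video frame from the loudness of the TTS audio. This is an amplitude-based approximation, not true phoneme or viseme lip sync, so it will not look fully natural. The function names `read_mono_pcm16` and `mouth_openness` are hypothetical, and I am assuming the TTS output is saved as a 16-bit mono WAV file and that the avatar exposes some jaw or mouth parameter that accepts a value between 0 and 1.

```python
import array
import math
import sys
import wave


def read_mono_pcm16(path):
    """Read a 16-bit mono WAV file and return (samples, sample_rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
        sample_rate = wav.getframerate()
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return list(samples), sample_rate


def mouth_openness(samples, sample_rate, fps=30):
    """Return one value in [0, 1] per video frame (normalized RMS loudness)."""
    n_frames = math.ceil(len(samples) * fps / sample_rate)
    rms = []
    for i in range(n_frames):
        # Slice the audio that plays during video frame i.
        start = i * sample_rate // fps
        end = min(len(samples), (i + 1) * sample_rate // fps)
        chunk = samples[start:end]
        if not chunk:
            rms.append(0.0)
            continue
        rms.append(math.sqrt(sum(s * s for s in chunk) / len(chunk)))
    peak = max(rms, default=0.0)
    if peak == 0:
        return [0.0] * len(rms)  # silent clip: mouth stays closed
    # Scale so the loudest frame maps to a fully open mouth.
    return [value / peak for value in rms]
```

In use, the values would be read once per clip, for example `samples, rate = read_mono_pcm16("speech.wav")` followed by `mouth_openness(samples, rate, fps=30)`, and each value applied to the avatar's mouth parameter on the matching frame. Smoothing the values over a few frames would probably reduce jitter.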

chef1 · February 20, 2025, 5:43pm

Hi @alilaptop393. I too am trying to find ways to visualise TTS from OpenAI. Did you have any success with this? Have you found any 3D avatars, mascots, or similar that you can recommend?

Thank you

alilaptop393 · February 20, 2025, 6:06pm

I hope you are doing well.

Regarding your question, I haven’t had any success in visualizing OpenAI’s TTS yet. However, I am also exploring different approaches. Could you clarify what kind of 3D avatar, mascot, or visualization method you are looking for? Are you interested in real-time lip-syncing, animated characters, or a specific software/tool for integration?

Looking forward to your response.

Best regards,
Mohsin Ali

chef1 · February 21, 2025, 6:21pm

Thanks for your reply.

Well, I’m looking to use HeyGen. The technology is available and cheap. The docs are there, but they are over the head of a no-coder like me.
