Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation
In the second stage, an audio-driven talking-head generation method is employed to produce compelling videos given the audio generated in the first stage.
Tags: Paper and LLMs, Talking Head Generation
- Pricing Type: Free
GitHub Link
The GitHub link is https://github.com/zhichaowang970201/text-to-video
Introduction
This GitHub repository, titled “Text-to-Video,” presents a two-stage framework for generating talking-head videos for arbitrary identities without identity-specific training. It includes various components, such as text-to-speech models (Tacotron, VITS, YourTTS, Tortoise), audio-driven talking-head generation methods (Audio2Head, StyleHEAT, SadTalker), and VideoRetalking. The repository provides links to the code and assets for these models, facilitating research and development in zero-shot identity-agnostic talking-head generation.
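The overall flow can be summarized as a thin pipeline around the two stages: text is first converted to speech, and the speech then drives the face animation. The sketch below is a minimal illustration of that structure, assuming hypothetical wrapper functions `synthesize_speech` and `animate_face` around the repository's TTS and talking-head components; the repository's actual entry points and script names may differ.

```python
# Minimal sketch of the two-stage text-to-video pipeline.
# NOTE: `synthesize_speech` and `animate_face` are hypothetical placeholders;
# wire them to the actual backends (e.g. VITS for stage 1, SadTalker for stage 2).

from dataclasses import dataclass


@dataclass
class PipelineConfig:
    tts_model: str = "vits"        # one of: tacotron, vits, yourtts, tortoise
    head_model: str = "sadtalker"  # one of: audio2head, styleheat, sadtalker


def synthesize_speech(text: str, model: str, wav_path: str) -> str:
    """Stage 1 (placeholder): convert text to a speech waveform with the chosen TTS model."""
    # e.g. run the selected TTS checkpoint and write the result to wav_path
    raise NotImplementedError(f"wire up the {model} TTS backend here")


def animate_face(wav_path: str, face_image: str, model: str, out_path: str) -> str:
    """Stage 2 (placeholder): drive a talking-head model with audio and a reference face."""
    # e.g. feed the stage-1 audio into SadTalker/Audio2Head/StyleHEAT
    raise NotImplementedError(f"wire up the {model} talking-head backend here")


def text_to_video(text: str, face_image: str, cfg: PipelineConfig) -> str:
    """Full pipeline: text -> speech audio -> talking-head video."""
    wav = synthesize_speech(text, cfg.tts_model, "speech.wav")
    return animate_face(wav, face_image, cfg.head_model, "talking_head.mp4")


# Example usage (requires the two backends to be wired up first):
# video_path = text_to_video("Hello!", "reference_face.png", PipelineConfig())
```

Since the framework is zero-shot and identity-agnostic, the same pipeline can in principle be pointed at any reference face image without retraining either stage.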
Content
KDD workshop: Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation