Inspiration
My inspiration came from Randi Zuckerberg when I attended her fireside talk. There was so much I wanted to ask her, but time was too short. I thought about other successful people, including those who have passed, who could offer quality advice from their own experience. With the recent advances in deep learning and my interest in machine learning, I wanted to create an engaging and educational hack.
What it does
I built a web app that lets you talk to anyone and have deep conversations with an AI version of them. Chimeric creates an engaging and immersive experience to learn from. I hope it makes classrooms and lecture halls at every education level far more immersive.
How I built it
The frontend is built with React and Next.js, and a Python Flask server runs the backend. GPT-3 generates the responses. For audio, I used AssemblyAI for speech-to-text, Google Cloud Text-to-Speech, and Real-Time Voice Cloning to synthesize the audio. First Order Motion Model generates dynamic videos from still images, and Wav2Lip lip-syncs those videos to the synthesized audio. Lastly, I store the results on Google Cloud Storage.
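As a rough illustration of how these stages chain together, here is a minimal sketch that wires them up as plain Python callables. The `Pipeline` class and every stage name are hypothetical placeholders, not the actual Chimeric code or any library's API. Each callable stands in for the real service, and the stubs at the bottom exist only so the sketch runs without external dependencies.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Hypothetical wiring of the stages described above; every field is a stand-in."""
    transcribe: Callable[[bytes], str]          # user audio -> text (speech-to-text)
    generate_reply: Callable[[str], str]        # question -> reply text (language model)
    synthesize_voice: Callable[[str], bytes]    # reply text -> cloned-voice audio
    animate: Callable[[bytes], bytes]           # still image -> moving video
    lip_sync: Callable[[bytes, bytes], bytes]   # (video, audio) -> lip-synced video
    upload: Callable[[bytes], str]              # final video -> storage location

    def respond(self, user_audio: bytes, portrait: bytes) -> str:
        question = self.transcribe(user_audio)
        reply = self.generate_reply(question)
        voice = self.synthesize_voice(reply)
        video = self.animate(portrait)
        final = self.lip_sync(video, voice)
        return self.upload(final)


# Stub stages (assumptions for illustration only) so the sketch is runnable.
demo = Pipeline(
    transcribe=lambda audio: audio.decode(),
    generate_reply=lambda text: f"reply to: {text}",
    synthesize_voice=lambda text: text.encode(),
    animate=lambda image: b"video:" + image,
    lip_sync=lambda video, audio: video + b"|" + audio,
    upload=lambda data: "stored/" + data.decode(),
)
print(demo.respond(b"hello", b"portrait"))
```

Keeping each stage behind a simple callable makes it easy to swap one model for another, or to time each step separately when chasing down slow responses.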
Challenges I ran into
This project has been incredibly tough for me. Cutting the response time was hard: early on, the app wasn't very immersive because it took too long to get a response. This was my first time using face detection models, and I got a lot of out-of-memory (OOM) errors. I also wanted better quality than the voice clone was giving me, so I had to manually enter audio after massaging the waveforms.
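One common way to work around OOM errors like these, not necessarily what was done here, is to process frames in batches and shrink the batch whenever memory runs out. The sketch below assumes a hypothetical `detect` callable and assumes that running out of memory surfaces as Python's built-in `MemoryError`; GPU frameworks usually raise their own exception type, which would need to be caught instead.

```python
from typing import Callable, List, Sequence


def run_in_batches(
    frames: Sequence,
    detect: Callable[[Sequence], List],
    batch_size: int = 16,
) -> List:
    """Run `detect` over `frames`, halving the batch size whenever memory runs out."""
    while batch_size >= 1:
        try:
            results: List = []
            for start in range(0, len(frames), batch_size):
                results.extend(detect(frames[start:start + batch_size]))
            return results
        except MemoryError:
            # Assumption: OOM shows up as MemoryError. Discard partial results
            # and retry the whole run with a smaller batch.
            batch_size //= 2
    raise MemoryError("out of memory even with a batch size of 1")
```

Smaller batches trade speed for memory, so this helps with crashes but can make the response-time problem worse.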
Accomplishments that I'm proud of
This is my first time using a lot of the technologies in my tech stack. I wanted to come to HackHarvard to learn new technologies and I did just that. I am very happy with the AI-generated graphics. They feel very immersive and I learned a lot about the people I was using to train the AI.
What I learned
I learned how to use Wav2Lip and Google Cloud Storage. Both were new to me, and it was quite satisfying to learn them.
What's next for Chimeric
I want to build a stronger user upload pipeline and add more community features. With more graphics processing power, I can scale the app to reach more people in the community. Lastly, I want to explore the ethics of deepfake technology.
Built With
- google-cloud
- gpt-3
- python
- react
- wav2lip
