Inspiration

Our inspiration came from a shared experience: searching for a specific YouTube video, only to find that it wasn't in a language we could understand. This is frustrating, especially since we rely on YouTube to learn new material for both personal and academic purposes.

We hope that the LanguageAlchemy project will serve as a tool to break down language barriers, fostering better communication and understanding. We aim to leverage technology to create a solution that contributes to a more connected and inclusive society.

What it does

LanguageAlchemy translates YouTube videos into more than 25 languages, based on user preference. It uses the OpenAI Whisper model to produce accurate, contextually relevant translations. Users enter a YouTube URL and select a target language; the system then detects the language spoken in the video and generates a translated transcription along with a short summary of the contents.

How we built it

LanguageAlchemy is a Python web application. The frontend is built with Streamlit for a clean, user-friendly interface, while the backend uses the OpenAI Whisper model for translation. yt-dlp handles the video processing, and a Hugging Face model generates the summary of the video. The frontend passes the user's URL and target language to the backend and displays the transcription and summary it returns.
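
To make the flow concrete, here is a minimal sketch of how such a pipeline might be wired together. It is an illustration under assumptions, not the project's actual code: the function names, the `VideoResult` type, the `"base"` Whisper model size, and the default summarization model are all hypothetical choices. The target-language translation step is left out because the write-up does not specify how it is performed.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class VideoResult:
    """Hypothetical container for the pipeline output."""
    detected_language: str
    transcript: str
    summary: str


def download_audio(url: str, out_dir: str = ".") -> str:
    """Download the best available audio stream with yt-dlp; return its path."""
    import yt_dlp  # third-party: pip install yt-dlp

    opts = {
        "format": "bestaudio/best",
        "outtmpl": f"{out_dir}/%(id)s.%(ext)s",
        "quiet": True,
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        info = ydl.extract_info(url, download=True)
        return ydl.prepare_filename(info)


def transcribe(audio_path: str) -> Tuple[str, str]:
    """Run Whisper locally; return (detected language code, transcript)."""
    import whisper  # third-party: pip install openai-whisper

    model = whisper.load_model("base")  # assumed model size
    result = model.transcribe(audio_path)
    return result["language"], result["text"]


def summarize(text: str) -> str:
    """Summarize the transcript with a Hugging Face summarization pipeline."""
    from transformers import pipeline  # third-party: pip install transformers

    summarizer = pipeline("summarization")  # assumed: library default model
    # truncation=True clips transcripts that exceed the model's input limit.
    return summarizer(text, truncation=True)[0]["summary_text"]


def process_video(
    url: str,
    download: Callable[[str], str] = download_audio,
    transcribe_fn: Callable[[str], Tuple[str, str]] = transcribe,
    summarize_fn: Callable[[str], str] = summarize,
) -> VideoResult:
    """Chain the three stages: download, transcribe, summarize."""
    audio_path = download(url)
    language, text = transcribe_fn(audio_path)
    return VideoResult(language, text, summarize_fn(text))
```

Passing each stage in as a callable keeps the stages independent, so any one of them can be swapped out or replaced with a stub for testing without touching the others. A Streamlit page could call `process_video` with the URL the user entered and display the fields of the result.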

Challenges we ran into

During development we ran into several challenges. First, the APIs we initially tried, such as Google Translate and the YouTube Transcript API, had compatibility problems and limitations that made them a poor fit for our system. Pivoting to OpenAI's Whisper model raised new difficulties: integrating it with our Streamlit frontend, and obtaining a usable API key without paying high fees for the service. We also quickly found that relying solely on Google Drive was not viable, so we had to explore other ways of transferring data from our local machines to the web. We tried ngrok, a tool that creates secure tunnels to share a localhost server with the web, but the added complexity in the deployment process brought errors of its own. Despite these challenges, our team persevered and worked through them over the course of the night, gaining valuable problem-solving skills along the way.

Accomplishments that we're proud of

Although we faced numerous challenges while hacking, we successfully deployed a web application that works as intended! The model's results were much better than we anticipated. As a team of multilingual members, we each personally tested the accuracy of the translations, and we were surprised by how well the model identified the language in the video and produced sensible translations.

What we learned

Exploring many different tech stacks over the 24 hours, we got to try new technologies such as Streamlit, Figma, and JavaScript. We learned more about machine learning models, web development techniques, and the software development cycle, and we also executed an idea from start to finish. As a team of interdisciplinary students, we gained experience working with teammates of different backgrounds and skills.

What's next for LanguageAlchemy

Although LanguageAlchemy works as intended, the time between loading the video and generating the transcription and summary is longer than we would like. We plan to research ways to reduce processing time for a better user experience.

** Note: We have one more team member, Vansh [Unknown Last Name], who could not make it to the submission deadline. **

Built With

  • huggingface
  • ngrok
  • openai
  • python
  • streamlit
  • whisper
  • youtube-dlp