Inspiration

Look, I know the feeling. You click on a clickbaity video, or perhaps a 2-minute tutorial, and instead of content, you're met with RAID SHADOW LEGENDS. What's terrible is that these ads are embedded in the videos, so there's no way you can accurately skip them. That's why we developed promoshun, which identifies if a video is largely promotion-based, fake, or real content. So next time, instead of using your time trying to find that timestamp where the content begins, just use our website!

What it does

Our website uses the Google Cloud Platform and NLP (Natural Language Processing) to determine the frequency of words used in the captions of YouTube videos. Given this information, we use machine learning to recognize whether a video is promotion-based or original content. Another feature we added is the ability to compare two YouTube videos and determine how similar they are. Let's say you are watching one video and you come across another that looks similar. With this tool, you can see how similar they are based on the extracted text. The more similar the videos, the higher the score, meaning that video is worth watching.
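To make the idea concrete, here is a minimal sketch of counting word frequencies in caption text and scoring how similar two videos are. It is not our actual Google Cloud pipeline: the function names word_frequencies and cosine_similarity are hypothetical, and cosine similarity over raw word counts is just one simple way such a score could be computed.

```python
import math
import re
from collections import Counter


def word_frequencies(captions: str) -> Counter:
    """Count how often each lowercase word appears in a caption transcript."""
    words = re.findall(r"[a-z']+", captions.lower())
    return Counter(words)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    # Counter returns 0 for missing words, so unshared words add nothing.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty transcript has no meaningful similarity
    return dot / (norm_a * norm_b)


# Made-up caption snippets, purely for illustration.
video_a = word_frequencies("Download the game today with my link in the description")
video_b = word_frequencies("Today we build a birdhouse, link to plans in the description")
score = cosine_similarity(video_a, video_b)
```

A score near 1.0 means the two transcripts use largely the same words in similar proportions, and a score of 0.0 means they share no words at all.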

How we built it

The frontend of our project was built with Flask (a Python framework), as well as HTML and CSS. The Flask frontend was connected to our Google Cloud project, which stored our machine learning model and natural language processor. We used Firebase Firestore to keep track of our data, since we did not want to reenter data every time we booted up the app.

Challenges we ran into

Though we originally planned to find which segments of a video are promotion-based, we realized that such a task would be very difficult for a machine learning model without a large dataset, so we put it on hold. Another issue was that our algorithm needed to be adjusted, since it was producing too many false positives. This can be improved using a more weighted machine learning model.
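As a rough illustration of the false-positive problem, the sketch below measures how often original-content videos get flagged as promotion at a given decision threshold. Raising the threshold is one common way to trade fewer false positives for more missed promotions. It is a different knob from reweighting the model itself, and the scores and labels here are made up for the example, not taken from our data.

```python
def false_positive_rate(scores, labels, threshold):
    """Fraction of original-content videos (label 0) flagged as promotion."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        return 0.0
    flagged = sum(1 for s in negatives if s >= threshold)
    return flagged / len(negatives)


# Hypothetical model outputs: probability that each video is promotion-based.
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.10]
# Hypothetical ground truth: 1 = promotion-based, 0 = original content.
labels = [1, 1, 0, 0, 0, 0]

fpr_default = false_positive_rate(scores, labels, threshold=0.5)
fpr_strict = false_positive_rate(scores, labels, threshold=0.7)
```

With these made-up numbers, the stricter threshold stops flagging the two borderline original videos while still catching both promotional ones.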
