Inspiration

  • After attending Will L.'s machine learning track at GHW: INIT 2023, we were fascinated by how easy it was to get started with natural language processing.
  • We also wanted to get started with MATLAB after the introduction at LHD: Build 2022, especially since we will be using it in some of our courses during the upcoming school year.
  • We came up with this project idea because we wanted to help creators make a better first impression on viewers so that they can succeed more easily in their careers.

What it does

The user enters information about their channel and video, and the program outputs the predicted number of views for the video. This allows them to A/B test how well their video might perform based on factors such as title, thumbnail and category. It helps creators fine-tune their video preview cards so that they can reach a wider audience.

How we built it

We used Kaggle's Youtube Thumbnail Dataset and YouTubers Saying Things dataset to get information about thumbnails, titles, subscriber counts and other predictors of the view count. We then used MATLAB to clean our data. This involved converting timestamps to a useful format and engineering features to be used as inputs to our neural network. We created the neural network with MATLAB's Deep Network Designer, where we trained it and tuned its parameters. Lastly, we deployed our solution as a web app on Gradio so that it can be used by the public.
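As an illustration only (the actual cleaning was done in MATLAB), the timestamp conversion and feature engineering step could be sketched in Python as follows. The function name, the input fields and the specific features are assumptions chosen for the example; they are not the columns of the Kaggle datasets or the inputs of the real network.

```python
import math
from datetime import datetime, timezone


def engineer_features(title, published_at, subscribers):
    """Turn raw video metadata into a flat dict of numeric features.

    All feature names here are hypothetical examples.
    """
    # Parse an ISO 8601 timestamp such as "2023-01-15T18:30:00+00:00".
    # A timestamp without an offset is treated as local time by astimezone().
    ts = datetime.fromisoformat(published_at).astimezone(timezone.utc)

    # Encode the hour on a circle so that 23:00 and 00:00 end up close together.
    hour_angle = 2 * math.pi * ts.hour / 24

    # Share of alphabetic characters in the title that are upper case.
    letters = [c for c in title if c.isalpha()]
    caps_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0

    return {
        "hour_sin": math.sin(hour_angle),
        "hour_cos": math.cos(hour_angle),
        "weekday": ts.weekday(),  # Monday = 0, Sunday = 6
        "title_words": len(title.split()),
        "title_caps_ratio": caps_ratio,
        "has_question": float("?" in title),
        # Subscriber counts span several orders of magnitude, so use a log scale.
        "log_subscribers": math.log10(subscribers + 1),
    }


if __name__ == "__main__":
    print(engineer_features("Why is THIS so hard?", "2023-01-15T18:30:00+00:00", 999))
```

Encoding the hour as a sine and cosine pair and taking the logarithm of the subscriber count are common ways to give a small fully connected network inputs on comparable scales; whether they help on this data would need to be checked.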

How this project helps content creators

Finding the right title and thumbnail to attract the highest click-through rate is not easy. The success of a YouTube video does not depend solely on its content: if someone does not click on a video while scrolling through their recommendations, they never get to watch it, and any effort put into catching the viewer's attention is lost. This is why YouTube preview cards are so crucial to the success of a video. However, it is difficult to find out exactly what viewers are interested in, and it can take days for a content creator to find the perfect thumbnail and title to suit a video.

In a video published by Veritasium in August of 2021 titled "Clickbait is Unreasonably Effective", Veritasium interviews the well-known creator MrBeast on the importance of a title and a thumbnail and their correlation to a video's performance. During a conversation starting at timestamp 11:03, MrBeast mentions that a creator would not know what title and thumbnail would attract the most viewers, unless they were some sort of "almighty being that could just predict what people would be interested in". Even one of the largest creators on the platform struggles to know how his audience will react to the previews of his videos, despite having been a YouTube creator for ten years.

With our tool, content creators can save time and instead focus on delivering quality content for their viewers. They do not have to worry about how their video's first impression will be received by their audience, as they can see an estimate of the video's success BEFORE the video is published and polish it beforehand.

Challenges we ran into

  • As we expected, the relationship between what users see in a YouTube preview and the number of views is very noisy because of the natural diversity of content creators. Since we did not have much data to work with, we had to make some compromises. The first machine learning model we tried was a boosted decision tree ensemble, LSBoost (from MATLAB); with so much noise present, adding more predictors quickly made it overfit. We then tried a fully connected neural network, which showed improvement.
  • Another problem was that we could not find any existing data on smaller YouTube creators. This exposes our model to survivorship bias: it only learns patterns from creators who already have a large audience and knows little about channels with lower subscriber counts. So while our hack was initially designed for creators who want more recognition, the resulting project is better suited to creators who are already growing but want to grow more.
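One quick way to see this kind of bias in a dataset is to count how many channels fall into each order of magnitude of subscribers. The sketch below is a hypothetical check in Python, not part of the project; the function name and the sample counts are made up for illustration.

```python
from collections import Counter


def subscriber_coverage(subscriber_counts):
    """Count channels per order of magnitude of subscribers.

    Expects non-negative integers. Returns a dict such as {"10^5": 1, "10^7": 2},
    ordered from the smallest bucket to the largest.
    """
    buckets = Counter()
    for n in subscriber_counts:
        # The number of digits minus one is the power of ten, e.g. 950 -> 2.
        exponent = len(str(int(n))) - 1
        buckets[exponent] += 1
    return {f"10^{exp}": count for exp, count in sorted(buckets.items())}


if __name__ == "__main__":
    # Made-up subscriber counts, only to show the shape of the output.
    sample = [5_000_000, 12_000_000, 800_000, 45_000_000, 950]
    print(subscriber_coverage(sample))
```

If almost all channels land in the largest buckets, predictions for small channels are extrapolations and should be treated with caution.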

Accomplishments that we're proud of

We are proud that we learned how to use MATLAB in just 36 hours. We built a working program and a neural network that accurately predicts video engagement.

What we learned

  • This was our first time using MATLAB. We watched videos, looked at tutorials and referred back to previous MLH livestreams to get a better understanding of both the programming language and the computing environment. It was also our first time building a neural network without Python libraries, which pushed us outside of our comfort zone.
  • We also managed to connect two different programming languages (Python and MATLAB), something we had never done before.

What's next for YouTube Creator Assistant

  • We want to expand our data to include smaller YouTube creators so that we have a better sense of how lower subscriber counts affect views. This can be done by sampling creators at random instead of focusing only on the latest and trendiest YouTube channels.
  • We plan to give users suggestions on how to attract more viewers, so that they do not have to enter different values at random to see where they can improve.
  • We would also like to extract more information out of a thumbnail such as facial expressions, arrows, and other eye-catching elements.
