Inspiration
Browsing the internet and finding the content you are looking for has become a real challenge. Every news outlet breaks an article into multiple pages for extra ad revenue.
What it does
We trained an ML model to power a Chrome extension that makes scrolling Facebook a better experience. The goal was to eliminate titles such as "12 reasons why you should...".
How I built it
We used Python with BeautifulSoup, Cheerio.js, and the Reddit API to scrape articles and build a training set for an ML model. A Node backend exposes an API that talks to Amazon ML to determine whether or not an article is clickbait. A Chrome extension pulls article links and headlines while the user scrolls Facebook and sends them to that backend. From there, in real time, we use Node and Cheerio to scrape the source of the article. This accomplishes several things: we can combine multi-page articles into a single reading experience, with the goal of injecting the merged content directly into the user's Facebook feed after passing it through a summarization API called summary.
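The multi-page merge step can be sketched as follows. This is a standard-library Python sketch standing in for the Node/Cheerio code; the `combine_pages` helper and the decision to keep only `<p>` text are illustrative assumptions, not the actual implementation:

```python
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collect the text inside <p> tags -- a stand-in for the
    selector-based extraction Cheerio does in the real pipeline."""
    def __init__(self):
        super().__init__()
        self.in_p = False
        self.paragraphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:
            self.paragraphs[-1] += data

def combine_pages(pages_html):
    """Merge the paragraph text of several article pages into one
    continuous reading experience."""
    combined = []
    for html in pages_html:
        parser = ParagraphExtractor()
        parser.feed(html)
        combined.extend(p.strip() for p in parser.paragraphs if p.strip())
    return "\n\n".join(combined)
```

The merged string is what would then be handed to the summarization API before injection into the feed.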
Challenges I ran into
Facebook was very restrictive about the scripts we could inject. No two news websites are structured the same, so scraping the articles themselves was a large challenge. On top of that, we had to compile a dataset to train the ML model.
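Compiling the dataset ultimately comes down to writing labeled headlines into the CSV shape that Amazon ML ingests as an S3 datasource. A minimal sketch, assuming a simple two-column schema (the column names and labels here are hypothetical, not the project's actual schema):

```python
import csv
import io

def to_training_csv(examples):
    """Serialize (headline, label) pairs as CSV for an Amazon ML
    datasource. Labels: 1 = clickbait, 0 = normal headline."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["headline", "is_clickbait"])  # assumed column names
    for headline, label in examples:
        writer.writerow([headline, label])
    return buf.getvalue()
```

In practice the rows came from scraped articles and the Reddit API, with the resulting file uploaded to S3 for training.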
Accomplishments that I'm proud of
- Some crazy regex
- A real-time API
- Accurately identifying clickbait articles
- Scraping data from news sites that were not alike
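For a flavor of the regex involved, a pattern like the one below flags listicle-style headlines such as "12 reasons why you should...". It is illustrative only, not the project's actual patterns:

```python
import re

# Hypothetical pattern: matches numbered listicles ("12 reasons why...")
# and common curiosity-gap phrasing.
CLICKBAIT_RE = re.compile(
    r"^\d+\s+(reasons?|things?|ways?|signs?)\b"
    r"|you won'?t believe"
    r"|what happened next",
    re.IGNORECASE,
)

def looks_like_clickbait(headline):
    """Cheap pre-filter before a headline is sent to the ML model."""
    return bool(CLICKBAIT_RE.search(headline))
```

A filter like this could run in the extension before calling the backend, saving a round trip for the most obvious cases.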
What we learned
- Amazon ML
- NodeJS
- Express
- jQuery
- Chrome API
- Reddit API
What's next for hackcu-clickbait
We are very close to having a product we can be truly proud of, and we hope it catches some traction. We are excited to move forward with the project.
Built With
- amazon-machine-learning
- amazon-web-services
- javascript
- node.js
- python