Cal-Hacks-2016

Media Online Sentiment Analysis Informing Citizens (MOSAIC)

This project is a web application built on the Flask framework. It lets a user search for a topic in the news articles produced by a specified news outlet (CNN, FoxNews, MSNBC, etc.) and then see how that outlet's portrayal of the topic, rated by the positive or negative sentiment of individual articles, has evolved over time. The application can also graph the same topic's reported sentiment over time for multiple news sources, so that citizens can compare how different outlets present the same topic and identify potential biases in the media they consume.

We implemented two different back ends for this project, each with its own advantages and drawbacks. The first implementation used two Microsoft Azure APIs through actual REST API calls. It used Microsoft Azure Bing Search to get articles from news sites on a given topic, then used Boilerpipe to extract the text from these articles, and finally performed sentiment analysis using the Microsoft Azure Text Analytics API (under the Cognitive Services branch). Our second implementation used two IBM Bluemix APIs, specifically two Watson services called AlchemyNews and AlchemyLanguage. Since we ran out of API calls for the Watson API very early on, we were unable to implement this back end using actual REST API calls. Our workaround was to enter our queries into an AlchemyNews and AlchemyLanguage demo page, scrape the results from the resulting webpage, and store them in a SQLite database.
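The storage step of the second back end can be sketched roughly as follows. This is a minimal illustration rather than the code we actually wrote: the table name, column names, and the helper functions open_store, save_scores, and load_scores are all hypothetical, and the scale of the sentiment score is assumed to be whatever the sentiment service returned.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a SQLite store for per-article sentiment scores.

    The schema is an assumption made for this sketch.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS article_sentiment (
               outlet    TEXT NOT NULL,
               topic     TEXT NOT NULL,
               published TEXT NOT NULL,
               url       TEXT NOT NULL,
               score     REAL NOT NULL,
               UNIQUE (outlet, topic, url)
           )"""
    )
    return conn


def save_scores(conn, outlet, topic, rows):
    """Store (published_iso_date, url, score) tuples for one outlet and topic.

    Re-saving the same article replaces its earlier score.
    """
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO article_sentiment VALUES (?, ?, ?, ?, ?)",
            [(outlet, topic, published, url, score) for published, url, score in rows],
        )


def load_scores(conn, outlet, topic):
    """Return (published_iso_date, score) pairs ordered by date."""
    cur = conn.execute(
        "SELECT published, score FROM article_sentiment "
        "WHERE outlet = ? AND topic = ? ORDER BY published",
        (outlet, topic),
    )
    return cur.fetchall()
```

Caching results this way means a repeated search for the same outlet and topic can be answered from the database instead of querying the sentiment service again.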

The front end remained consistent between the two back end implementations. We used matplotlib to make a simple plot of the sentiment data we received for each date in a given time range. We plotted sentiment score against date and then served that plot to the user. The goal of graphing these sentiments over time was to help citizens see how each media outlet's portrayal of a certain topic varies over time.
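A plot of this kind could be produced along the lines below. This is only a sketch under stated assumptions: the function names daily_mean_sentiment and plot_sentiment are hypothetical, the input is assumed to be (ISO date, score) pairs per outlet, and averaging multiple articles from the same day into one point is a choice made for this example, not necessarily what the project did.

```python
from collections import defaultdict
from datetime import date


def daily_mean_sentiment(rows):
    """Average the scores of articles published on the same day.

    rows: iterable of (iso_date_string, score) pairs.
    Returns a list of (date, mean_score) sorted by date.
    """
    by_day = defaultdict(list)
    for published, score in rows:
        by_day[date.fromisoformat(published)].append(score)
    return [(day, sum(scores) / len(scores)) for day, scores in sorted(by_day.items())]


def plot_sentiment(series_by_outlet, out_path):
    """Draw one sentiment-vs-date line per outlet and save it as an image.

    series_by_outlet: dict mapping an outlet name to its (iso_date, score) rows.
    """
    # Imported here so the aggregation helper works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render off-screen, as a web server would
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for outlet, rows in series_by_outlet.items():
        points = daily_mean_sentiment(rows)
        ax.plot([d for d, _ in points], [s for _, s in points], marker="o", label=outlet)
    ax.set_xlabel("Date")
    ax.set_ylabel("Sentiment score")
    ax.legend()
    fig.autofmt_xdate()  # tilt date labels so they do not overlap
    fig.savefig(out_path)
    plt.close(fig)
```

Passing several outlets in series_by_outlet puts their lines on the same axes, which is what makes the side-by-side comparison of news sources possible.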

The ultimate goal of this project is to provide a tool that informs media consumers about the biases and sentiment leanings of the news providers they read. We both learned an immense amount working together on this project, and we hope to keep improving it and to build a nicer user interface in the future.

This project was done at Calhacks 3.0 over the weekend of Nov 11-13, 2016.
