FictionFilter

We use NLP and machine learning to assess the veracity of articles, fact-check the claims they make by cross-referencing sites from across the internet, and help people determine whether their news source is trustworthy.

How it Works

Our approach is multi-pronged.
First, we use an LSTM to classify an article's trustworthiness based on its structure and word choice. LSTMs are a well-established architecture for text classification, which lets us predict the trustworthiness of even individual sentences without relying on potentially biased and incomplete lists of "which news sites are credible." This matters because many news sites contain a mixture of truthful and unreliable content.
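As a rough illustration of the classification step, here is a minimal sketch of a Keras LSTM text classifier. The vocabulary size, sequence length, layer sizes, and the name build_classifier are assumptions made for this example and do not reflect the project's actual model or training data.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 10000  # assumed vocabulary size, not taken from the project
MAX_LEN = 50        # assumed number of token ids per sentence


def build_classifier(vocab_size=VOCAB_SIZE, embed_dim=64, lstm_units=64):
    """Build a small binary text classifier (hypothetical architecture)."""
    model = tf.keras.Sequential([
        layers.Embedding(vocab_size, embed_dim),  # token ids -> dense vectors
        layers.LSTM(lstm_units),                  # reads the sequence in order
        layers.Dense(1, activation="sigmoid"),    # score in [0, 1]
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


model = build_classifier()

# Two fake "sentences" of random token ids, standing in for preprocessed text.
batch = np.random.randint(1, VOCAB_SIZE, size=(2, MAX_LEN))
scores = model.predict(batch, verbose=0)  # one score per sentence
```

The model here is untrained, so its scores are meaningless until it is fitted on labeled examples. The sketch only shows the shape of the pipeline: token ids in, one trust score per sentence out.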
In addition, we use extractive summarization to isolate the most important statements in the article and search fact-checkers for them. This means that if the article makes a claim that appears on Snopes, PolitiFact, or a host of other verified fact-checking websites, we can retrieve its truth rating.
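To make the claim-extraction idea concrete, here is a minimal sketch of one common extractive technique: scoring each sentence by the average frequency of its non-stopword terms and keeping the top few. This is an assumed stand-in, not necessarily the summarizer the project uses. The stopword list, the regex-based sentence splitter, and the name top_claims are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"the", "a", "an", "is", "was", "were", "are", "by",
             "of", "and", "to", "in", "it", "that", "this"}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def top_claims(text, k=2):
    """Return the k highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freqs = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average term frequency, so long sentences are not favored automatically.
        return sum(freqs[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]


article = ("The vaccine was approved by regulators. "
           "The weather was nice. "
           "Regulators said the vaccine is safe.")
claims = top_claims(article, k=2)
```

The selected sentences could then be used as search queries against fact-checking sites. That lookup step is omitted here because it depends on the specific search API in use.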

Robust

We use robust techniques to extract the text from the site given to us and verify it server-side, and we also query one of Google's search APIs. This gives us multiple measuring sticks with which to determine the trustworthiness of an article. Using several methods together helps protect us from source bias and from discriminating against any one particular writing style.
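One way to picture how several measuring sticks could be combined is a weighted blend of the classifier score and the fact-check results. The sketch below is purely illustrative: the Signals structure, the combined_trust function, and the equal default weighting are assumptions, not the project's actual scoring rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Signals:
    """Hypothetical container for the evidence gathered about one article."""
    classifier_score: float          # 0..1 score from the text classifier
    fact_check_verdicts: List[bool]  # True where a fact-checker rated a claim as true


def combined_trust(signals: Signals, classifier_weight: float = 0.5) -> float:
    """Blend the classifier score with the share of claims rated true."""
    if not signals.fact_check_verdicts:
        # No fact-check matches found: fall back to the classifier alone.
        return signals.classifier_score
    fact_score = sum(signals.fact_check_verdicts) / len(signals.fact_check_verdicts)
    return (classifier_weight * signals.classifier_score
            + (1 - classifier_weight) * fact_score)


no_matches = combined_trust(Signals(0.8, []))
mixed = combined_trust(Signals(0.8, [True, False]))
```

Because neither signal decides the outcome alone, a well-written but false article or an awkwardly written but accurate one is less likely to be misjudged.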

Relevant

The Internet makes it easier and easier for people to share the truth. Unfortunately, it has a similar effect on lies. Fake news spreads rapidly, as does satire disguised as truth. This can widen the gulf in our political climate by reinforcing echo chambers that no longer care about what is real. In addition, dangerously misinformed citizens make worse decisions. By arming people, especially the less media-literate, with the tools they need to ensure the media they consume is accurate, we can create a more informed populace, better able to combat hoaxes about topics like COVID-19 and other current events.

Technology we Used

We used TensorFlow and Keras to create our LSTM neural network, and the NLTK library for preprocessing. We also used an extractive text summarization technique to give readers digestible chunks of information about the articles they're reading.