Inspiration
How often do you visit a website such as Wikipedia and end up reading heaps of content before finding what you're actually looking for? Find++ aims to solve this by locating information on a page using natural language semantics and word relationships.
What it does
Find++ is a Chrome extension that helps users locate what they're looking for on a webpage, even when the wording is not an exact match. It uses pre-trained word embeddings to determine how words and their meanings relate.
How I built it
The backend is a Flask API written in Python, with the Gensim library powering the semantic search. In particular, it uses paragraph embeddings to determine context, applying transfer learning on top of pre-trained embeddings. The client is a Chrome extension that is installed in the user's Chrome browser and can be triggered on any webpage. Searches can then be made in a non-exact fashion.
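To give a rough feel for what a non-exact search involves, here is a minimal sketch in plain Python. It is not the Find++ code: instead of Gensim paragraph embeddings it averages word vectors from a tiny hand-made table (`TOY_VECTORS`, entirely hypothetical) and ranks paragraphs by cosine similarity to the query. The function names are also made up for illustration. A real backend would load pre-trained vectors rather than hard-code them.

```python
import math
import re

# Hypothetical toy embedding table, invented for this example.
# A real system would load pre-trained vectors instead.
TOY_VECTORS = {
    "car": [0.9, 0.1, 0.0],
    "vehicle": [0.85, 0.15, 0.05],
    "engine": [0.8, 0.2, 0.1],
    "river": [0.05, 0.9, 0.1],
    "water": [0.1, 0.85, 0.15],
    "music": [0.0, 0.1, 0.95],
    "song": [0.05, 0.15, 0.9],
}


def tokenize(text):
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def embed(text):
    """Average the vectors of all known words; None if no word is known."""
    vectors = [TOY_VECTORS[t] for t in tokenize(text) if t in TOY_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_paragraphs(query, paragraphs):
    """Return (paragraph_index, score) pairs, best match first."""
    query_vec = embed(query)
    if query_vec is None:
        return []
    scored = []
    for index, paragraph in enumerate(paragraphs):
        vec = embed(paragraph)
        if vec is not None:
            scored.append((cosine(query_vec, vec), index))
    scored.sort(reverse=True)
    return [(index, score) for score, index in scored]


paragraphs = [
    "The engine powers the vehicle.",
    "The river carries water downstream.",
    "She wrote a song about music.",
]
ranking = rank_paragraphs("car", paragraphs)
```

In this toy setup the query "car" appears in none of the paragraphs, yet the first paragraph ranks highest because "engine" and "vehicle" sit close to "car" in the vector table. That is the kind of non-exact match the extension is after.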
Challenges I ran into
Integrating the Flask API with JavaScript and Chrome proved particularly challenging. Another challenge was that the matches were non-deterministic and depended heavily on the amount and quality of content on the webpage.
Accomplishments that I'm proud of
Getting something sizeable done in the time provided.
What I learned
Integration between JavaScript, Python and Chrome extensions. Data cleaning and preprocessing. REST API creation with Flask. Applying embedding techniques to retrieve vectors for sentences.
What's next for Find++
Improve the accuracy of matches using more sophisticated deep learning models. Augment information to help search on sites with less content. Implement voice-based interaction.