Inspiration

My work on NLP in research and open-source projects led me to keep building useful apps with it.

What it does

KeywordFS analyzes the words in multiple uploaded files and determines which ones are most meaningful to the content of each file. Users can also filter documents by keyword instead of searching through the text files one by one.
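
The keyword filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the function name `filter_files_by_keyword`, the sample filenames, and the keyword lists are all hypothetical, and the keywords are assumed to have been extracted for each file already.

```python
from typing import Dict, List


def filter_files_by_keyword(
    keywords_by_file: Dict[str, List[str]], query: str
) -> List[str]:
    """Return the names of files whose keyword list contains the query.

    Matching is case-insensitive and exact on the whole keyword
    (an assumption for this sketch; the real app may match differently).
    """
    needle = query.strip().lower()
    return sorted(
        name
        for name, keywords in keywords_by_file.items()
        if needle in (kw.lower() for kw in keywords)
    )


# Hypothetical per-file keywords, standing in for the extractor's output.
sample = {
    "notes.txt": ["transformers", "attention", "embedding"],
    "recipe.txt": ["flour", "oven", "butter"],
    "paper.txt": ["embedding", "retrieval", "ranking"],
}

matches = filter_files_by_keyword(sample, "Embedding")
print(matches)  # ['notes.txt', 'paper.txt']
```

Keeping the filter separate from the extraction step means the same lookup works regardless of which library produced the keywords.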

How we built it

The app is hosted on Streamlit, a data science dashboard framework. The backend relies mainly on the KeyBERT library.
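
As a rough sketch of how KeyBERT could produce per-file keywords, the snippet below wraps `KeyBERT.extract_keywords` in a small helper. The helper names (`make_keybert_extractor`, `keywords_for_files`), the `top_n=5` setting, and the single-word n-gram range are assumptions for illustration; the app's actual parameters are not given here. In a Streamlit app, the input texts would typically come from `st.file_uploader` with `accept_multiple_files=True`.

```python
from typing import Callable, Dict, List, Tuple

# An extractor maps a document to (keyword, score) pairs,
# which is the shape KeyBERT's extract_keywords returns for one document.
Extractor = Callable[[str], List[Tuple[str, float]]]


def make_keybert_extractor(top_n: int = 5) -> Extractor:
    """Build an extractor backed by KeyBERT (needs `pip install keybert`)."""
    from keybert import KeyBERT  # imported lazily; loads a BERT-based model

    model = KeyBERT()

    def extract(text: str) -> List[Tuple[str, float]]:
        return model.extract_keywords(
            text,
            keyphrase_ngram_range=(1, 1),  # single words only (assumed)
            stop_words="english",
            top_n=top_n,
        )

    return extract


def keywords_for_files(
    texts_by_file: Dict[str, str], extract: Extractor
) -> Dict[str, List[str]]:
    """Run the extractor on each file and keep only the keyword strings."""
    return {
        name: [keyword for keyword, _score in extract(text)]
        for name, text in texts_by_file.items()
    }
```

With KeyBERT installed, calling `keywords_for_files({"notes.txt": text}, make_keybert_extractor())` would return a dictionary that maps each filename to its keyword list. Passing the extractor in as a function also makes it easy to swap in a stub when testing without downloading a model.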

Challenges we ran into

Topic modeling was originally planned as an extra feature, but it was scrapped because of an incompatibility with the app framework.

Accomplishments that we're proud of

Applying BERT for the first time in a fully hosted data science application.

What we learned

The biggest takeaway was learning about new deep learning libraries geared toward applications.

What's next for TopicFS

Potentially find another library for topic modeling, and possibly extend semantic search to other file types such as images and PDFs.
