Inspiration
The inspiration for ClusterMail came from the growing need for better email organization and visualization. With the increasing volume of emails people receive daily, traditional inbox structures often fall short in helping users manage their communications effectively. We were inspired by the concept of topic clustering and graph visualization to create a more intuitive and efficient way of handling email workflows.
What it does
ClusterMail automatically creates clusters for topics in your email and allows you to view these topics and their connected emails in a graph format. It provides a visual representation of your email universe, making it easier to navigate and manage your messages. The system uses machine learning algorithms to group similar emails together, creating a more organized and accessible inbox experience.
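As a rough illustration of the graph view described above, the sketch below turns a mapping of emails to topic labels into node and edge lists of the kind a graph UI could render. The function name `build_graph`, the `topic:`/`email:` id prefixes, and the dictionary fields are assumptions made for this example, not ClusterMail's actual data model.

```python
from collections import defaultdict


def build_graph(assignments):
    """Turn {email_id: topic_label} into (nodes, edges) for a graph view.

    Hypothetical helper: the field names are illustrative only.
    """
    by_topic = defaultdict(list)
    for email_id, topic in assignments.items():
        by_topic[topic].append(email_id)

    nodes, edges = [], []
    # Sort so the output order is deterministic.
    for topic, email_ids in sorted(by_topic.items()):
        topic_node = f"topic:{topic}"
        nodes.append({"id": topic_node, "type": "topic", "label": topic})
        for email_id in sorted(email_ids):
            email_node = f"email:{email_id}"
            nodes.append({"id": email_node, "type": "email", "label": email_id})
            # One edge connects each email to the topic it was clustered into.
            edges.append({
                "id": f"{topic_node}->{email_node}",
                "source": topic_node,
                "target": email_node,
            })
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = build_graph({"m1": "travel", "m2": "travel", "m3": "billing"})
    print(len(nodes), "nodes,", len(edges), "edges")
```

Each topic becomes a hub node with one edge per email, so a cluster appears as a star around its topic.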
How we built it
We built the frontend and the user-facing data handling with Next.js. A separate Python web server, built with FastAPI, hosts BERTopic and handles vectorization of the email data.
Challenges we ran into
Running BERTopic inside a FastAPI web server proved challenging because of numerous dependency issues.
We also ran into rate limits on the public-facing APIs we were using. Classifying an entire inbox required many requests to these APIs, which we were able to partially mitigate by batching. The heavy compute time of the BERTopic transformer model was a further challenge: the time from a user's first login to seeing their email graph was significant.
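The batching idea can be sketched as follows: instead of one request per email, emails are grouped into fixed-size chunks and each chunk is sent in a single call. The names `chunked`, `classify_inbox`, and `classify_batch`, and the default batch size of 50, are assumptions for illustration; the real API client and its limits are not described here.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def classify_inbox(emails, classify_batch, batch_size=50):
    """Classify all emails using one call per batch rather than one per email.

    `classify_batch` is a hypothetical callable that takes a list of emails
    and returns one result per email, in the same order.
    """
    results = []
    for batch in chunked(emails, batch_size):
        results.extend(classify_batch(batch))
    return results


if __name__ == "__main__":
    # Stand-in for a remote API call: records how many requests were made.
    request_sizes = []

    def fake_api(batch):
        request_sizes.append(len(batch))
        return [text.upper() for text in batch]

    labels = classify_inbox([f"email {i}" for i in range(120)], fake_api)
    print(len(labels), "results from", len(request_sizes), "requests")
```

With 120 emails and a batch size of 50, this makes three requests instead of 120, which is the kind of reduction that helps stay under a per-request rate limit.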
Accomplishments that we're proud of
We were able to persevere through several setbacks and put together a cohesive product. We successfully integrated BERTopic with our platform, and we used React-Flow to create an appealing and interesting user interface.
What we learned
We learned how to expose machine learning models to applications through web servers, and how to use several tools and technologies that were new to us.
What's next for ClusterMail
We hope to polish the experience and refine our clustering to provide more fine-grained, actionable insights based on related emails. We also want to give more weight to when each email was sent when clustering.
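One possible way to factor time into clustering, offered only as a sketch and not as the team's planned method, is to scale content similarity by an exponential decay on the gap between two emails' timestamps. The function names and the 30-day half-life are assumptions chosen for this example.

```python
import math
from datetime import datetime, timedelta


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def time_weighted_similarity(vec_a, time_a, vec_b, time_b, half_life_days=30.0):
    """Content similarity scaled down as the two emails get further apart in time.

    The weight halves for every `half_life_days` between the emails
    (an assumed, tunable parameter).
    """
    gap_days = abs((time_a - time_b).total_seconds()) / 86400
    decay = 0.5 ** (gap_days / half_life_days)
    return cosine_similarity(vec_a, vec_b) * decay


if __name__ == "__main__":
    sent = datetime(2024, 1, 1)
    same_day = time_weighted_similarity([1.0, 0.0], sent, [1.0, 0.0], sent)
    month_apart = time_weighted_similarity(
        [1.0, 0.0], sent, [1.0, 0.0], sent + timedelta(days=30)
    )
    print(same_day, month_apart)
```

Two emails with identical content keep their full similarity when sent at the same time and half of it when sent one half-life apart, so a clustering step that uses this score would favor grouping emails that are both similar and close in time.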
Built With
- bertopic
- docker
- fastapi
- gmail
- google-gmail-oauth
- nextjs
- numpy
- openai
- pandas
- postgresql
- python
- pytorch
- react
- tailwindcss
- typescript