GooseGuard


About the Project

We were inspired to create this project because, like everyone else, we are constantly bugged and frustrated by call scammers who seem to be more persistent than ever. However, the situation became personal when one of our team members walked in on their parents being scammed out of money. Shocked that even their tech-savvy parents could fall for such deception, we realized the severity of the issue and wanted to develop something that could effectively tackle these scams.

How We Built the Project

We built the program using React/Next.js for the frontend and Convex, a sponsor tool, for the backend. Convex came with its own challenges, as we found it difficult to learn under the time pressure of the hackathon. We stuck with it because it let us set up a backend database quickly and offered a built-in route to Google API authentication. Unfortunately, we were unable to get the Google API authentication fully operational.

To identify scam content, we used BERT (Bidirectional Encoder Representations from Transformers), a transformer-based language model. We fine-tuned it on two large Kaggle datasets, one of spam calls and one of spam texts. We were asked to dream big for Hack the North and we did just that: our model has 340 million parameters and achieved an accuracy of 96%. The model processes input by tokenizing it, running the tokens through BERT, and deciding whether the input is likely to be a scam based on the patterns and scam scripts it learned during fine-tuning.
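The last step, turning the classifier's raw output into a scam or not-scam decision, can be sketched as below. This is a minimal illustration under stated assumptions: it assumes the fine-tuned model returns two raw scores (logits) per input in the order [not scam, scam], and the function names and the 0.5 threshold are hypothetical rather than taken from our code.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def is_scam(logits, threshold=0.5):
    """Return (flagged, scam_probability).

    Assumed label order: index 0 = not scam, index 1 = scam.
    """
    scam_probability = softmax(logits)[1]
    return scam_probability >= threshold, scam_probability

# Hypothetical logits for one message.
flagged, probability = is_scam([-1.2, 2.3])
print(flagged, round(probability, 3))
```

Raising the threshold trades missed scams for fewer false alarms, so in practice it would be tuned on held-out data.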

To transcribe audio from calls to text, we used another sponsor tool, Groq. Groq significantly reduces transcription time compared with running the audio through a large language model (LLM). This lets us convert audio input into text efficiently, and the text is then processed by our BERT model. The integration speeds up the whole pipeline, providing quick and accurate scam detection.
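The audio-to-verdict flow can be sketched as a two-stage pipeline. This is a minimal sketch, not our production code: `screen_call`, `Verdict`, and the two stand-in functions are hypothetical names, and the real transcription and classification calls are passed in as plain callables so the example runs without any external service.

```python
from typing import Callable, NamedTuple

class Verdict(NamedTuple):
    transcript: str
    is_scam: bool

def screen_call(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    classify: Callable[[str], bool],
) -> Verdict:
    """Transcribe the audio, then classify the resulting transcript."""
    transcript = transcribe(audio).strip()
    if not transcript:
        # Nothing was transcribed, so there is nothing to classify.
        return Verdict("", False)
    return Verdict(transcript, classify(transcript))

# Stand-ins for the real services, for illustration only.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")

def fake_classify(text: str) -> bool:
    return "gift card" in text.lower()

print(screen_call(b"Please pay the fine with a gift card.",
                  fake_transcribe, fake_classify))
```

Keeping the two stages behind simple function arguments makes it easy to swap either one out or to test the pipeline offline.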

GooseGuard's Multi-Layered Scam Detection

We designed GooseGuard to provide the most robust and accurate scam detection across phone calls, text messages, and emails by implementing multiple layers of scam detection:

Script Analysis: The script is analyzed against our fine-tuned BERT model to detect if it follows known scam patterns.
Personal Information Requests: Most importantly, we flag the input if it asks for any personal information, as this is a common trait of scams.
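The personal information check can be approximated with simple pattern matching. The sketch below is only an assumption about how such a check might look: the verb and item lists are hypothetical and far shorter than a real deployment would need, and the 40-character window between the two is an arbitrary choice.

```python
import re

# Hypothetical phrase lists; a real list would be longer and tuned on data.
REQUEST_VERBS = r"(?:give|send|share|confirm|verify|provide|tell|enter|read)"
SENSITIVE_ITEMS = (
    r"(?:social security number|ssn|credit card|card number|"
    r"password|pin|bank account|date of birth)"
)

# A request verb followed, within 40 characters, by a sensitive item.
PATTERN = re.compile(
    rf"\b{REQUEST_VERBS}\b.{{0,40}}?\b{SENSITIVE_ITEMS}\b",
    re.IGNORECASE,
)

def asks_for_personal_info(text: str) -> bool:
    """Return True if the text appears to request personal information."""
    return PATTERN.search(text) is not None

print(asks_for_personal_info("Please confirm your credit card number."))
print(asks_for_personal_info("Your package has shipped."))
```

A rule like this is cheap and easy to explain to users, which makes it a useful complement to the model's score.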

Time permitting, we also planned on implementing the following layers of scam detection:

Area Code Analysis: If a phone number is provided, we would first analyze the area code as an initial signal of a potential scam.
Robocaller Detection: We would then check whether the call comes from a robocaller, adding another layer of validation.
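Since the area code layer was not built, the following is only a sketch of one way it might work. It assumes North American numbers (an optional leading 1 followed by ten digits), and the `WATCHLIST` contents are placeholder values, not a real list of suspicious area codes.

```python
import re
from typing import Optional

# Placeholder values only; not a real list of suspicious area codes.
WATCHLIST = {"999"}

def area_code(number: str) -> Optional[str]:
    """Return the 3-digit area code of a North American number, or None."""
    digits = re.sub(r"\D", "", number)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    return digits[:3] if len(digits) == 10 else None

def area_code_signal(number: str) -> bool:
    """Return True if the number's area code is on the watchlist."""
    return area_code(number) in WATCHLIST

print(area_code("+1 (519) 555-0123"))
print(area_code_signal("999-555-0100"))
```

Because caller ID can be spoofed, a match here would be treated as a weak hint rather than proof.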

These layers work together to improve the model's accuracy in identifying scam content, providing a more comprehensive protection mechanism.
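One simple way to combine the layers is to start from the model's scam probability and add a fixed bump for each extra signal that fires. This is a sketch under stated assumptions: the `Signals` fields, the weights, and the cap at 1.0 are all hypothetical and would need tuning on real data.

```python
from dataclasses import dataclass

@dataclass
class Signals:
    script_probability: float          # from the text classifier, 0.0 to 1.0
    asks_personal_info: bool           # personal information request layer
    suspicious_area_code: bool = False  # planned layer
    robocaller: bool = False            # planned layer

# Hypothetical weights for each extra signal.
WEIGHTS = {
    "asks_personal_info": 0.3,
    "suspicious_area_code": 0.1,
    "robocaller": 0.2,
}

def risk_score(signals: Signals) -> float:
    """Combine all layers into one score between 0.0 and 1.0."""
    score = signals.script_probability
    for name, weight in WEIGHTS.items():
        if getattr(signals, name):
            score += weight
    return min(score, 1.0)

print(risk_score(Signals(script_probability=0.5, asks_personal_info=True)))
```

An additive score keeps each layer's contribution visible, so a dashboard can show why a message was flagged.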

Challenges We Faced

One of the main challenges was learning how to use Convex effectively. Although it had a learning curve, we found it valuable for quickly setting up a backend database and for exploring features like Google API authentication, which we worked to integrate into our project but could not get fully operational.

Another challenge we encountered was understanding the limitations of mobile development. Initially, we spent a significant amount of time working towards developing a mobile app. However, we soon realized that the security restrictions on iPhone and Android would make retrieving emails, text messages, and phone calls a highly complex and slow process. This realization led us to pivot towards building a web application, a decision we made late into the hackathon.

Finding a comprehensive dataset to fine-tune our BERT model was another challenge. After reviewing the literature on related datasets, we settled on the two that produced the best results.

What We Learned

Throughout this project, we learned a great deal about machine learning, backend development, and scam detection patterns. Utilizing BERT for natural language processing and Groq for audio transcription proved to be both highly efficient and rewarding. It has been an enriching experience to build a tool that can potentially help protect individuals from online scams, giving us a sense of positive social impact.

Working together in a short 32-hour window brought us closer as a team. It was fascinating to learn about each other’s different experiences and perspectives, and we formed strong bonds that we will carry beyond this hackathon. Meeting a diverse array of students from across the globe, representing all populated continents, was also inspiring and contributed to our growth. This collaboration and shared mission for positive impact have made this hackathon an unforgettable journey.

Future Development

Moving forward, we plan to continue implementing the layers of scam detection and fine-tuning our model for even greater accuracy. We also aim to focus on robust authentication features to further secure our platform. Most importantly, we intend to develop a mobile application better suited for our target population, as our phones are the primary contact point that scammers use through calls, messages, and emails.