Inspiration

Everybody knows that seniors are vulnerable to phone scams. The members of our team knew multiple elders who have fallen victim to these attacks. We formulated an idea for how to fix this, and we seemed to be the first to have a scalable, effective way to do this.

What it does

We built a proof-of-concept prototype of the Deception Defender. In its current state, the user inputs an audio file into the program, then clicks "analyze". The audio is fed to Whisper V3 "small" running on your device, and is transcribed into text. This text is piped into Gemma 2-2B, a small language model AI also running on-device. Gemma reads the transcript and alerts the user of any possible scams. The Deception Defender reliably differentiates scam calls and non-scams. This process is omni-lingual.

How we built it

We used OpenAI's whisper V3 "small" model to transcribe the audio. Whisper is open sourced. We also used Google's Gemma 2-2B LLM model running on LM Studio, similar to ChatGPT, but open sourced. The GUI runs on PyQt5.

Challenges we ran into

Setting up our code editor to run whisper was unexpectedly difficult. There are a few strange dependencies, such as FFMPEG, that were not clearly needed. Gemma 2-2B is a very small language model developed by Google Deepmind, so its instruction following capabilities are limited. A significant effort was put into our system prompt.

Accomplishments that we're proud of

The Deception Defender team has made 3 significant achievements with developing our program, and we find our product to have 2 unique capabilities not seen current state-of-the-art software.

Using the first principals method of engineering, we found three criteria must be met for a system like Deception Defender to be expanded and adopted worldwide:

  1. Logistics - The program must not take large servers nor large amounts of work to maintain. We solved this by running the model on-device, being able to maintain the database in a simple text file prompt.
  2. Economics - The program must be free or nearly free to run. We solved this by running open sourced models locally.
  3. Privacy - The program must not send all of your phone calls to an external server. We solved this by running the entire process on-device.

Our proof of concept demonstrated two surprising capabilities we did not expect to work as well as they did:

  1. Reliable detection and low false-positives: The Deception Defender reliably alerts the user for positive scams, and reliably does not alert the user for non-scams. This seems to be a product of our system prompt, alongside the reasoning capacity of Gemma 2-2B.
  2. Omni-lingual: With no modification at all, the Deception Defender works in all common languages. The Deception Defender team did not expect nor explicitly program this feature into our product, and only learned that this works by accident. During testing of our program, one of our members got a real life phone scam call, but in Mandarin. We recorded this audio and sent it to Deception Defender, and it still functioned properly. Interestingly, the Deception Defender responded with an alert written in English. This is likely a product of the English system prompt.

What we learned

The Deception Defender marked many firsts in our teams programming history. This was the first time we have ever used a proper GUI for a project (we used PyQt5). This was also the first time our team has used whisper-v3 or an AI model like Gemma integrated into a project. We learned a lot about integrating the different technologies into our project.

What's next for Deception Defender

Built With

Share this project:

Updates