Inspiration

Our inspiration came from our grandparents, who we know have difficulty navigating the internet, which grows more complex every day as new technologies emerge. We all wish we could be there to help them, but time is precious, so we created an AI companion named Silver Surfer to help them navigate the web in our place.

What it does

Silver Surfer is a Chromium browser extension that analyzes the page using both text and visual information to understand what the user is seeing. It can then answer user queries and perform a variety of other actions, including agentic navigation, injected scripts that highlight content and change text sizes across a webpage, and modifications to any HTML page to suit the user's preferences. It also lets the elderly user request help from a family member, in which case an agent sends that family member screenshots over a messaging platform of their choice.
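To make the text-size injection concrete, here is a minimal TypeScript sketch of how a content script might enlarge text across a page. It is an illustration, not the extension's actual code: `scaleText`, `Sizable`, and the scale factor are hypothetical names and values chosen for this example.

```typescript
// Hypothetical helper type: the smallest shape we need from a DOM element.
interface Sizable {
  style: { fontSize: string };
}

/**
 * Multiply the font size of every element by `factor`.
 * All sizes are measured before any are written, so a resized parent
 * cannot inflate the size its children report (no compounding).
 */
function scaleText<T extends Sizable>(
  elements: Iterable<T>,
  getSizePx: (el: T) => number,
  factor: number
): void {
  // Pass 1: read the current size of every element.
  const measured = Array.from(elements, (el) => ({ el, px: getSizePx(el) }));
  // Pass 2: write the scaled size as an inline style.
  for (const { el, px } of measured) {
    const scaled = Math.round(px * factor * 100) / 100;
    el.style.fontSize = `${scaled}px`;
  }
}
```

In a content script this could be called as `scaleText(document.querySelectorAll<HTMLElement>("body *"), (el) => parseFloat(getComputedStyle(el).fontSize), 1.25)`. Passing the measuring function in as a parameter keeps the helper independent of the DOM.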

How we built it

Our Chromium-based extension is built with the Plasmo Framework and styled with Tailwind to match its iconic comic theme, alongside a NextJS backend that manages API calls. The general agentic model runs through OpenRouter, linked to Gemini 1.5 Flash, to get end-to-end responses that are as accurate as possible. A stripped version of the webpage's HTML and a screenshot are sent to the model as context so it can understand what the user is seeing. Text-to-speech uses an ElevenLabs voice, fine-tuned to be easily understood by elderly people. The platform is secured with a protected login via Auth0 to make sure the user's information is saved and kept secure (ideally through a Google account). Gumloop Agentic AI automatically sends screenshots of the elderly person's page to the family member helping them via a messaging app (which we demoed with Discord).
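The context-passing step could look roughly like the TypeScript sketch below, which strips a page's HTML and packs it with a screenshot into a chat request in the OpenAI-compatible format that OpenRouter accepts. Everything named here is an assumption made for illustration: `stripHtml`, `buildPageRequest`, the prompt wording, the reading of "stripped" as "scripts, styles, and comments removed", and the model slug `google/gemini-flash-1.5`.

```typescript
type TextPart = { type: "text"; text: string };
type ImagePart = { type: "image_url"; image_url: { url: string } };

interface ChatRequest {
  model: string;
  messages: { role: "user"; content: [TextPart, ImagePart] }[];
}

// Assumed meaning of "stripped": drop scripts, styles, and comments,
// then collapse whitespace, keeping the remaining markup for structure.
function stripHtml(html: string): string {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, " ")
    .replace(/<!--[\s\S]*?-->/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Build one user message holding the page text and the screenshot.
// The model slug is an assumption; check OpenRouter's model list.
function buildPageRequest(
  question: string,
  html: string,
  screenshotDataUrl: string,
  model: string = "google/gemini-flash-1.5"
): ChatRequest {
  const text = `Page HTML:\n${stripHtml(html)}\n\nUser question: ${question}`;
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text },
          { type: "image_url", image_url: { url: screenshotDataUrl } },
        ],
      },
    ],
  };
}
```

The resulting object would be sent as the JSON body of a POST to OpenRouter's chat completions endpoint, with the API key in an `Authorization: Bearer` header. Naming the model explicitly in each request is what keeps responses on the intended model.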

Challenges we ran into

We ran into a multitude of challenges. At first we wanted to use ElevenLabs as the central model, but it did not handle text input well, so we pivoted to Gemini as our general model and kept ElevenLabs for text-to-speech only. Another challenge was our backend: we had trouble using the langchain library with our original C# backend, which we fixed by converting to a NextJS backend. We also found that taking audio input through an extension is not possible unless the extension generates an invisible page that then captures the audio. Finally, we had issues with the AI models we trialed: our original token allowance was running low, which caused requests to default to a worse model. We fixed this by using OpenRouter, where we could specify the exact model we wanted.
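One way to create such an invisible page in a Manifest V3 Chromium extension is an offscreen document; the write-up does not say which mechanism was actually used, so the sketch below is an assumption. `OffscreenApi`, `ensureAudioPage`, and the `offscreen.html` file name are hypothetical, and the interface only mirrors the two `chrome.offscreen` methods the sketch needs.

```typescript
// Minimal subset of chrome.offscreen, declared locally so the sketch
// can be type-checked and exercised outside a browser.
interface OffscreenApi {
  hasDocument(): Promise<boolean>;
  createDocument(params: {
    url: string;
    reasons: string[];
    justification: string;
  }): Promise<void>;
}

/**
 * Create the hidden audio page if it does not exist yet.
 * Returns true when a new document was created, false when one
 * was already open (an extension may only have one at a time).
 */
async function ensureAudioPage(
  offscreen: OffscreenApi,
  url: string = "offscreen.html" // hypothetical page bundled with the extension
): Promise<boolean> {
  if (await offscreen.hasDocument()) {
    return false;
  }
  await offscreen.createDocument({
    url,
    reasons: ["USER_MEDIA"], // the page will call getUserMedia
    justification: "Capture microphone input for voice queries",
  });
  return true;
}
```

In the extension's background script the real API object would be passed in, for example `ensureAudioPage(chrome.offscreen)`, which requires the `offscreen` permission in the manifest. The hidden page itself would then request the microphone with `navigator.mediaDevices.getUserMedia({ audio: true })`; microphone permission may first need to be granted from a visible extension page.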

Accomplishments that we're proud of

A huge accomplishment in building this extension was the agentic AI, which can deeply understand the page the user is seeing while performing complex multi-step tasks. Another was creating a beautiful and creative UI that matched our Silver Surfer theme. The name comes from a classic Marvel character, which inspired the comic-based style; we reinterpreted it to connote surfing the web, with silver being a color commonly associated with old age. Silver Surfer eases what can usually be a stressful experience for those unfamiliar with navigating modern technology.

What we learned

We learnt that different tools have their pros and cons, especially through our exposure to these new technologies. Working in such a diverse team showed us how collaborating enabled us to succeed despite the setbacks we faced with certain technologies along the way.

What's next for Silver Surfer

We aim to improve Silver Surfer by implementing fraud detection across websites.
