Inspiration
We sometimes encounter individuals who face physical challenges with their limbs. Some are born with these conditions, while others develop disabilities later in life due to illnesses like strokes or accidents involving cars or motorcycles.
In particular, when someone’s hands are impaired or in cases of quadriplegia, they may be unable to perform even basic tasks on their own. Simple activities such as turning on a computer or conducting a search can become impossible, leading to isolation and an inability to meet essential needs.
We wanted to create an app that addresses these challenges by enabling individuals to use computers seamlessly, facilitating smoother communication with the outside world, and promoting an independent, self-directed life.
What it does
Our app is an AI-powered accessibility tool designed to help individuals with limited mobility navigate websites using voice commands. It allows users to control the mouse and keyboard by speaking.
With commands like 'Click the Button' to open the Google page and 'go to ---' to visit specific websites, the app offers a simple way for users to interact with computers, promoting independence and ease of access.
How we built it
We created a browser using Electron.js, utilizing Tinytune and OpenAI's Whisper AI to parse speech-to-text. We then pass in the raw HTML and user's prompt into our backend API, which then uses GPT-4o-mini to parse the HTML and get the appropriate selectors for whatever the user wants to select.
Challenges we ran into
Building a browser using Electron.js, although we thought it would be easy as it utilizes JS, was way more complicated than we thought. There are a lot more things to consider compared to creating a web app. Another significant issue is that passing HTML into an LLM requires a lot of tokens and takes a long time to get a response from GPT-4o-mini. We tried stripping the HTML to only tags like 'button' or 'a', but it deleted all the HTML elements by deleting a top-level tag like 'body', which also deleted any tags underneath it.
Accomplishments that we're proud of
Never in my mind would I have thought of building a browser during a hackathon.
What we learned
Don't build browsers or anything that requires tremendous preparation in a 24-hour hackathon.
What's next for SoundSurf
Probably consider making a Chrome extension instead or using more browser automation to simplify HTML parsing.




Log in or sign up for Devpost to join the conversation.