Inspiration

When ideating on Friday, we were inspired by the topics around improving accessibility with bleeding-edge technologies. We wanted to build something genuinely cool and technically challenging that also provides real value to underserved users. We decided to focus on people with physical impairments: 1 in 9 Americans are physically impaired to some degree, yet they remain underserved. We saw a huge gap in the current offerings in the accessibility automation space, and closing it was a problem that was technically challenging but rewarding to work on.

What it does

SpeakEasy is a fully featured AI-powered browser automation tool. It allows you to browse the web and get information without needing to touch or see your browser at all.

How we built it

This project revolves around several AI agent 'actors', each equipped with different tools. The user talks to a conversational assistant, built on language and voice models, that provides a voice interface for 'talking to' sites and navigating the browser. The assistant sends commands to a browser agent. This browser agent builds a comprehensive knowledge base for every site using different segmentation and vision models, giving it a deep understanding of which elements can and should be interacted with. This allows us to distill the site down to the core needs of the user and tell the user what next steps to take while navigating.
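To make the "compile the site down" idea concrete, here is a minimal sketch of the shape of that step. It is not our actual implementation: it swaps the segmentation and vision models for Python's standard-library HTML parser, and it matches a spoken command to an element by simple word overlap. The names `Element`, `InteractiveElementParser`, `extract_elements`, and `resolve_command` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional

# Tags treated as interactive in this sketch (an assumption, not the project's list).
INTERACTIVE_TAGS = {"a", "button", "input"}


@dataclass
class Element:
    tag: str
    label: str


class InteractiveElementParser(HTMLParser):
    """Collects interactive elements with a short human-readable label for each."""

    def __init__(self):
        super().__init__()
        self.elements: List[Element] = []
        self._open: Optional[Element] = None  # element whose text is being collected

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        attr = dict(attrs)
        # Prefer explicit accessibility hints; fall back to visible text later.
        label = attr.get("aria-label") or attr.get("placeholder") or attr.get("name") or ""
        element = Element(tag, label)
        self.elements.append(element)
        if tag != "input":  # <input> is a void element with no text content
            self._open = element

    def handle_data(self, data):
        # Use the first non-empty text inside the element as its label.
        if self._open is not None and not self._open.label:
            self._open.label = data.strip()

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open.tag:
            self._open = None


def extract_elements(html: str) -> List[Element]:
    """Reduce a page to the elements a user could act on."""
    parser = InteractiveElementParser()
    parser.feed(html)
    return parser.elements


def resolve_command(command: str, elements: List[Element]) -> Optional[Element]:
    """Pick the element whose label shares the most words with the command."""
    words = set(command.lower().split())
    best, best_score = None, 0
    for element in elements:
        score = len(words & set(element.label.lower().split()))
        if score > best_score:
            best, best_score = element, score
    return best


if __name__ == "__main__":
    page = (
        '<h1>Shop</h1><a href="/cart">View cart</a>'
        '<input name="q" placeholder="Search products">'
        '<button>Check out</button>'
    )
    target = resolve_command("search for shoes", extract_elements(page))
    print(target)
```

In a real pipeline, the parser stage would be replaced by model output (for example, detected regions and their predicted roles), and the word-overlap matcher by the conversational assistant's own intent resolution. The two-stage structure of reducing the page first and then matching the request against it is the part this sketch is meant to show.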

Challenges we ran into

Traditional large language and multi-modal models simply didn't give us anywhere near the results we wanted; they were much too generalized and inaccurate for our use. Our biggest challenges lay in both sourcing and fine-tuning different models, some of which worked and some of which did not. This was an incredibly time-consuming process, and for quite a while we were unsure whether the idea could even be executed with the time and resources we had. We had to be quite aggressive in blending different techniques to get the results we wanted.

Accomplishments that we're proud of

Making it work was definitely the best part of our weekend! Our first automated browser session was truly a breath of fresh air: it showed us that the idea was, at the very least, somewhat valid and achievable by the end of the hackathon.

What we learned

This was a great opportunity to try out a ton of different ML models and blend them with traditional scraping and crawling techniques to get the results we wanted, not only more quickly but also more accurately.

What's next for SpeakEasy

The fact that this can be done should inspire a lot of people! We live in a world where truly revolutionary and practical projects that could genuinely benefit people can be built in just 36 hours! We'd love for you to star the repo and try it out for yourself. There are detailed instructions for running the project in the README.
