Inspiration
We've always been inspired by the challenges faced by people with disabilities, particularly a close acquaintance who is blind. Their daily commute and navigation are fraught with obstacles. With recent advances in large language models (LLMs) and computer vision, we saw an opportunity to make a significant difference. While smartphones are powerful tools and can act as a guide dog of sorts, they aren't practical for continuous use, especially while navigating. This led us to conceptualize GooseGuide.
What It Does
GooseGuide is more than just a robot; it's a companion. Equipped with a camera, it scans its surroundings and uses computer vision to identify objects. But what sets it apart is its ability to converse naturally. Users can interact with GooseGuide just as they would with a human guide. They can give commands like "Can I walk forward 10 steps?" or "Can you take me backward right for 5 meters?". Using Cohere's API, these commands are translated into vectors of distance and direction, guiding the robot accordingly. As it moves, it communicates with the user, alerting them of obstacles and ensuring their safety. In essence, GooseGuide serves as the eyes for the blind, walking ahead and providing real-time feedback.
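In GooseGuide, Cohere's API does the heavy lifting of turning natural speech into a movement vector. As a minimal sketch of the target output format only, here is a hypothetical regex-based stand-in (the function name, direction list, and assumed 0.7 m step length are all our own illustration, not the actual Cohere-backed pipeline):

```python
import re

# Assumed average step length for converting "steps" to meters.
STEP_LENGTH_M = 0.7

DIRECTIONS = ("forward", "backward", "left", "right")

def parse_command(text: str):
    """Extract a movement vector from a command like
    'Can I walk forward 10 steps?'.
    Returns (direction, distance_in_meters) or None if unparseable."""
    text = text.lower()
    # Pick the first direction word that appears in the command.
    direction = next((d for d in DIRECTIONS if d in text), None)
    # Match a number followed by a unit: steps, meters, or m.
    match = re.search(r"(\d+(?:\.\d+)?)\s*(steps?|meters?|m\b)", text)
    if direction is None or match is None:
        return None
    value, unit = float(match.group(1)), match.group(2)
    meters = value * STEP_LENGTH_M if unit.startswith("step") else value
    return direction, meters
```

The real system would feed the transcribed speech to Cohere and receive a structured vector back; this sketch just shows the shape of that vector.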
How We Built It
Our journey began at Hack The North, where we sourced all our hardware components. The heart of GooseGuide is its computer vision capability, powered by technologies like YOLO. This allows it to scan and recognize objects in its path. To understand and process user commands, we integrated Cohere's API, which breaks down speech into actionable vectors.
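Once YOLO produces bounding boxes, the robot has to decide which detections actually block the user's path. A hedged sketch of one way to do that, flagging boxes that overlap the central "corridor" of the camera frame (the `Detection` class, frame width, and corridor fraction are assumptions for illustration, not our exact implementation):

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One YOLO-style detection: class label plus pixel bounding box."""
    label: str
    x1: float
    y1: float
    x2: float
    y2: float

def obstacles_ahead(detections, frame_width=640, corridor_frac=0.4):
    """Return labels of detections whose boxes overlap the central
    corridor of the frame, i.e. objects in the walking path.
    corridor_frac is the assumed fraction of frame width treated
    as 'directly ahead'."""
    half = frame_width * corridor_frac / 2
    left, right = frame_width / 2 - half, frame_width / 2 + half
    return [d.label for d in detections if d.x2 > left and d.x1 < right]
```

Labels returned here could then be spoken back to the user as obstacle alerts.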
Challenges We Ran Into
Merging various technologies was no easy feat. We faced hurdles integrating the robot's camera with the computer vision system, often experiencing lag. Initially, the robot connected to a computer over its own Wi-Fi network, but this setup blocked our access to the Cohere API. We also hoped to incorporate real-time GPS services; however, due to hardware limitations at the hardware distribution center, we had to pivot and transmit data over Wi-Fi using an ESP32. Time constraints further compounded these challenges.
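Sending commands to an ESP32 over Wi-Fi calls for a compact wire format. A minimal sketch of what such a format could look like, assuming a one-byte direction code plus a little-endian 32-bit float distance (the field layout and direction codes are hypothetical, not our actual protocol):

```python
import struct

# Hypothetical 5-byte wire format for commands sent to the ESP32:
# 1 byte direction code + 4-byte little-endian float distance (meters).
DIRECTION_CODES = {"forward": 0, "backward": 1, "left": 2, "right": 3}
FMT = "<Bf"

def encode_command(direction: str, meters: float) -> bytes:
    """Pack a movement vector into the bytes sent over the socket."""
    return struct.pack(FMT, DIRECTION_CODES[direction], meters)

def decode_command(payload: bytes):
    """Unpack bytes received on the ESP32 side back into a vector."""
    code, meters = struct.unpack(FMT, payload)
    names = {v: k for k, v in DIRECTION_CODES.items()}
    return names[code], meters
```

A fixed binary layout like this keeps each command to a handful of bytes, which helps when the link is laggy.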
Accomplishments We're Proud Of
Despite the obstacles, we're immensely proud of what we've achieved. GooseGuide is not just a project; it's a solution with real-world applicability. We've managed to integrate multiple AI applications seamlessly, showcasing the potential of both computer vision and LLMs. More than the technological achievements, we're elated at the prospect of our creation genuinely aiding the community.
What We Learned
This journey was a treasure trove of learning. We delved deep into robotics and discovered the intricacies of melding computer vision with LLMs. The challenges we faced taught us resilience and the importance of adaptability in innovation.
What's Next for GooseGuide
The future is bright for GooseGuide. We envision a sturdier and more advanced robot body. Incorporating GPS tracking is high on our agenda to enhance navigation. We're also keen on launching a web platform to display real-time locations. On the technical front, we aim to refine our computer vision model for efficiency and foster a more intuitive interaction between the user and the robot. Ultimately, we hope to design a robot capable of carrying heavier loads, further enhancing its utility for users.
In essence, GooseGuide is not just a technological marvel; it's a beacon of hope for those navigating the world without sight.
Slide Deck
https://docs.google.com/presentation/d/1S0Fz4J9ShYt7beCtwMhWgSWF1FXCER99/edit?rtpof=true
Built With
- cohere
- openai
- robomaster
- whisper