Inspiration

In a world where anyone can put on a VR headset and see places they've never been, search Google for the questions in the back of their mind, and send and receive photos from across the globe, we wanted to give the visually impaired a way to see as they have never seen before. Those of us who are sighted take for granted the ability to move freely through the world, and this inspired us to give others more of that same freedom. We called it "GuideCam" because, like a guide dog, our application is meant to be a companion and a guide for the visually impaired.

What it does

GuideCam provides an easy-to-use interface for the visually impaired to ask questions, either through a Braille keyboard on their iPhone or by speaking aloud into a microphone. They can ask questions like "Is there a bottle in front of me?", "How far away is it?", and "Notify me if there is a bottle in front of me", and the application will answer them out loud, or alert them when certain objects appear in front of them.

How we built it

Python scripts running on a laptop take a webcam picture every 2 seconds and upload it to a cloud storage bucket. When the user asks a question like "Is there a bottle in front of me?", either through the Braille keyboard on the iPhone or by speaking (transcribed to text with Google's Speech API), we fetch the most recent picture from the bucket and use Google's Vision API to determine whether a bottle is in it. Distance is estimated with the pinhole-camera formula: distance = (known width of object × focal length of camera) / (width of object in pixels). Sketches of both pieces follow.
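As a rough illustration, a minimal version of the capture loop might look like the sketch below. The bucket name, blob name, and camera index are placeholders for this writeup, not our exact code.

```python
# Hypothetical sketch of the capture loop: grab a webcam frame every
# 2 seconds and overwrite the latest image in a Cloud Storage bucket.
import time

import cv2
from google.cloud import storage

BUCKET = "guidecam-frames"  # assumed bucket name

def capture_loop():
    cam = cv2.VideoCapture(0)                # default laptop webcam
    bucket = storage.Client().bucket(BUCKET)
    while True:
        ok, frame = cam.read()
        if ok:
            # Encode the frame as JPEG and replace the newest image.
            _, jpeg = cv2.imencode(".jpg", frame)
            bucket.blob("latest.jpg").upload_from_string(
                jpeg.tobytes(), content_type="image/jpeg")
        time.sleep(2)
```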
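On the query side, checking the latest frame for a named object and estimating its distance could be sketched as follows. The focal length and frame width are calibration values for the specific camera; the numbers here are placeholder assumptions, as is the function name.

```python
# Hypothetical sketch of answering "Is there a bottle in front of me?":
# fetch the newest frame, run Vision API object localization, and apply
# the pinhole distance formula. Constants are placeholder calibrations.
from google.cloud import storage, vision

BUCKET = "guidecam-frames"   # assumed bucket name (see capture sketch)
KNOWN_WIDTH_CM = 7.0         # real-world width of the demo object
FOCAL_LENGTH_PX = 800.0      # camera focal length, found by calibration
FRAME_WIDTH_PX = 1280.0      # horizontal resolution of the webcam

def find_object(label):
    blob = storage.Client().bucket(BUCKET).blob("latest.jpg")
    image = vision.Image(content=blob.download_as_bytes())
    client = vision.ImageAnnotatorClient()
    response = client.object_localization(image=image)
    for obj in response.localized_object_annotations:
        if obj.name.lower() == label.lower():
            # Vision API returns normalized [0,1] vertices; convert to pixels.
            xs = [v.x for v in obj.bounding_poly.normalized_vertices]
            width_px = (max(xs) - min(xs)) * FRAME_WIDTH_PX
            return KNOWN_WIDTH_CM * FOCAL_LENGTH_PX / width_px  # distance, cm
    return None  # object not found in the latest frame
```

For example, a 7 cm wide bottle that appears 100 pixels wide with an 800-pixel focal length comes out to 7 × 800 / 100 = 56 cm away.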

Challenges we ran into

Getting the iPhone and a separate laptop to communicate was difficult, as was getting all the separate parts of the system working together. We also had to revise our ideas about what the app should do several times as we ran into constraints.

Accomplishments that we're proud of

We are proud that we learned to use Google's ML APIs, that we got both Braille keyboard and voice input working, and that we provide both object detection and distance estimation (for our demo object). We are also proud that we came up with an idea that matters to us and built a project we know can help people.

What we learned

We learned to use Google's ML APIs, how to create iPhone applications, how to get an iPhone and a laptop to exchange information, and how to collaborate on a big project and split up the work.

What's next for GuideCam

We intend to improve the Braille keyboard by adding a backspace and by supporting chorded input, where multiple keys pressed simultaneously record a single letter, as on a standard Braille writer. A sketch of that chording logic appears below.
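The actual keyboard lives in the iPhone app, but the chord-detection idea is language-independent; a minimal Python sketch under the assumption that a letter is committed when all held keys are released (class and method names are hypothetical) might look like this:

```python
# Hypothetical sketch of chorded Braille input: dots held down together
# form one letter, committed when every key has been released.
# Dot numbering follows standard six-dot Braille (a=1, b=1+2, c=1+4, ...).
BRAILLE = {
    frozenset({1}): "a",
    frozenset({1, 2}): "b",
    frozenset({1, 4}): "c",
    frozenset({1, 4, 5}): "d",
    frozenset({1, 5}): "e",
    # ... remaining letters omitted for brevity
}

class ChordReader:
    def __init__(self):
        self.down = set()    # dots currently held
        self.chord = set()   # all dots seen during this chord

    def key_down(self, dot):
        self.down.add(dot)
        self.chord.add(dot)

    def key_up(self, dot):
        self.down.discard(dot)
        if not self.down:    # chord complete once all keys are released
            letter = BRAILLE.get(frozenset(self.chord), "?")
            self.chord.clear()
            return letter
        return None          # chord still in progress
```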

Built With

  • google's-speech-api
  • google's-vision-api