Inspiration

We were both interested in doing something with computer vision; it was new to both of us, but we thought it was cool. Since we were only a team of two at the time, we went looking for other team members to join us. As we talked with more and more people, we narrowed down exactly what we wanted to do, and we eventually found a few problems that could all be solved with a single interface: the GestureInterface!

What it does

GestureInterface lets the user interact with their computer using hand gestures, currently finger gestures. Once GestureInterface is running, the user moves their hand into the view of the camera. When the mouse point (the average point of the fingertips) is in a spot they like, they can zero the mouse to that position. As they move their hand away from that point, GestureInterface moves the mouse according to the movement of the hand. Depending on the number of fingers the user retracts (1, 2, or 3), GestureInterface will click the left, middle, or right mouse button for them, so the user doesn't have to!
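
As a rough illustration of that control mapping, here is a minimal sketch (not the exact GestureInterface code) assuming the pyautogui library for mouse control and a hypothetical finger count supplied by the detection side of the program:

```python
# Sketch of the gesture-to-mouse mapping, assuming pyautogui.
# The retracted-finger count is assumed to come from a separate
# (hypothetical) detection routine.
import pyautogui

BUTTON_MAP = {1: "left", 2: "middle", 3: "right"}

def apply_gesture(dx, dy, retracted_fingers):
    """Move the cursor by the hand's displacement from the zeroed point,
    then click if 1, 2, or 3 fingers are retracted."""
    pyautogui.moveRel(dx, dy)                    # follow the hand
    button = BUTTON_MAP.get(retracted_fingers)   # 1/2/3 -> left/middle/right
    if button:
        pyautogui.click(button=button)
```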

How we built it

We initially tried to find the hand in the video feed with background differentiation and a calibrated colour filter. However, that ended up taking too much time, so we settled on a fixed colour filter to separate the hand from the background. We then looked for advice and were told that simplifying the shape of the hand would simplify our program, so we turned the hand into a convex hull. Next we started drawing lines and points on the hand so we could visualize what the program sees, and we just played around with the program. After some time, we noticed patterns that occur while extending and retracting fingers, so we developed an algorithm to look for those patterns and find fingers. Finally, we added the mouse interface and mapped the controls.
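
The sketch below shows roughly what that pipeline looks like with OpenCV (Python 4.x API); the HSV bounds are placeholders rather than the values we tuned, and it assumes the largest skin-coloured blob in the frame is the hand:

```python
# Rough sketch of the detection pipeline: fixed colour filter -> largest
# contour -> convex hull drawn over the frame. Placeholder HSV bounds.
import cv2
import numpy as np

LOWER = np.array([0, 30, 60], dtype=np.uint8)     # placeholder lower HSV bound
UPPER = np.array([20, 150, 255], dtype=np.uint8)  # placeholder upper HSV bound

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, LOWER, UPPER)          # set colour filter
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        hand = max(contours, key=cv2.contourArea)  # assume largest blob is the hand
        hull = cv2.convexHull(hand)
        cv2.drawContours(frame, [hull], -1, (0, 255, 0), 2)  # visualize the hull
    cv2.imshow("GestureInterface", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```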

Challenges we ran into

Two large challenges we ran into were detecting the hand and interpreting the data. Because of the room we were in, we wanted a very robust detection system so that the varying sunlight shining through the windows wouldn't throw off the hand detection. However, that ended up being a distraction, because we realized we could spend many hackathons optimizing the hand detector, so we decided to keep it simple and just use a colour filter. The second challenge was interpreting our data. We had a basic idea that we could somehow use the spaces in between the fingers to help identify them. However, after substantial development, we realized we had no way to pinpoint those points, so we had to completely revamp the way we interpret our data. We ended up playing around with lines and circles on different parts of the hand until we found a pattern that let us consistently detect fingers for a proof of concept.
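
For reference, the "spaces between the fingers" idea we first tried corresponds roughly to OpenCV's convexity defects. A minimal sketch of that approach, assuming `hand` is the contour from the colour-filter step above:

```python
# Sketch of the abandoned approach: count deep, sharp convexity defects
# as the valleys between extended fingers.
import cv2
import numpy as np

def count_finger_gaps(hand):
    hull_idx = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull_idx)
    if defects is None:
        return 0
    gaps = 0
    for start, end, far, depth in defects[:, 0]:
        a = np.linalg.norm(hand[end][0] - hand[start][0])
        b = np.linalg.norm(hand[far][0] - hand[start][0])
        c = np.linalg.norm(hand[end][0] - hand[far][0])
        # Angle at the defect point (law of cosines); a narrow, deep
        # defect is treated as the gap between two fingers.
        angle = np.arccos(np.clip((b**2 + c**2 - a**2) / (2 * b * c + 1e-6), -1.0, 1.0))
        if angle < np.pi / 2 and depth > 10000:  # depth is in 1/256 pixel units
            gaps += 1
    return gaps
```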

What we learned

We learned about OpenCV and its many functions that make computer vision easier, and a lot about how computer vision works in general: how computers really see, and what they can do to interpret the world around them using cameras.

Accomplishments that we're proud of

We're proud that we got GestureInterface working well enough for a demo; because of the many obstacles we encountered, we were doubtful we would complete something in time. But in the end, we were able to hack GestureInterface together for Hack the North! We are most proud of our fingertip detection, because we ended up doing it completely blind. It feels like something we truly made ourselves, because of the special thought and research we put into our detection system.

What's next for GestureInterface

There are three next steps we can see for GestureInterface. First, a more robust detection system with background differentiation and a calibratable colour filter, which we wanted to do from the start. Second, an app that lets the user assign their own macros to the interface. Third, we would like to implement a CNN to detect full gestures instead of just fingers. Because we have a video stream, we could extract enough data across many different gestures for a CNN to become viable, which would also let GestureInterface interpret many more gestures for the user.
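
For the first of those steps, a calibratable colour filter could work something like the sketch below: sample the pixels in a box the user covers with their hand and derive HSV bounds from them. This is an assumption about a future implementation, not existing GestureInterface code.

```python
# Possible calibration step: build an HSV range from a region of the
# frame that the user fills with their hand.
import cv2
import numpy as np

def calibrate_from_roi(frame, x, y, w, h, pad=10):
    """Return (lower, upper) HSV bounds sampled from the box at (x, y)."""
    roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
    pixels = roi.reshape(-1, 3).astype(int)
    lower = np.clip(pixels.min(axis=0) - pad, 0, 255).astype(np.uint8)
    upper = np.clip(pixels.max(axis=0) + pad, 0, 255).astype(np.uint8)
    return lower, upper
```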
