Inspiration

Artificial intelligence is playing a growing role in our daily lives, and technologies that can recognize and respond to human movements are becoming more common. In the aftermath of a global pandemic, technology-mediated communication is also on the rise. This state of affairs inspired us to create an artificial intelligence model that can identify human movements and interact with a software interface based on those movements.

What it does

GAT: Gestural Artificial-intelligence Technology is a pair of AI models that recognize human movements through a camera and perform specific actions in response. One model is optimized to run entirely in the browser, and the other runs on your computer as a desktop app. With these models, developers can map hand gestures to specific actions.

First Model (Website Optimized): Our first model can be built into any website; it recognizes human gestures through a camera and performs actions based on those gestures. To demonstrate it, we built the UI of a social media app and assigned hand gestures to scrolling, mouse pointer movement, liking posts, and automated typing of certain words.

Second Model (Computer System Optimized): Our second model runs on a computer system; it recognizes human gestures through a camera and performs actions based on those gestures. We trained it to recognize hand gestures and use them to control the mouse pointer, click, type certain words automatically, and press the backspace and return/enter keys. We built a messaging app to demonstrate the user experience this model enables.
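To give a sense of how such a gesture-to-action mapping might look, here is a minimal Python sketch. It assumes a keyboard-and-mouse automation library (pyautogui) and made-up gesture labels; the actual mapping inside GAT may differ.

```python
# Illustrative only: dispatch a recognized gesture label to a desktop action.
# pyautogui and the gesture names below are assumptions, not GAT's actual code.
import pyautogui

def perform_action(gesture: str, x: int = 0, y: int = 0) -> None:
    if gesture == "point":        # index finger up: move the mouse pointer
        pyautogui.moveTo(x, y)
    elif gesture == "pinch":      # thumb and index together: click
        pyautogui.click()
    elif gesture == "open_palm":  # all fingers up: type a preset word
        pyautogui.write("hello")
    elif gesture == "fist":       # no fingers up: backspace
        pyautogui.press("backspace")
    elif gesture == "thumbs_up":  # thumb only: return/enter
        pyautogui.press("enter")
```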

How we built it

Our First Model (Website Optimized): For our first model, we used the TensorFlow.js hand library to track the movement of a user's hand, along with a camera library to capture the hand on video. Using the hand keypoints the library reports, we created gesture functionality on our mock HTML webpages. These gestures include moving the mouse pointer and scrolling the webpage with your hands, and in our example, liking images on a social media platform.

Our Second Model (Computer System Optimized): Our second model was built with Python. The OpenCV library was used to access the camera of the host device, and the MediaPipe library enabled hand detection. Using MediaPipe, we created a module that recognizes hands, identifies which fingers are up, and measures the distance between fingers. With these methods, we trained our AI to associate different hand gestures with different operations. To demonstrate the experience our AI provides, we created a messaging app and made it compatible with our model.
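The following is a minimal sketch of that detection pipeline, assuming the standard OpenCV and MediaPipe Hands APIs; the fingers-up heuristic, the pinch threshold, and the gesture labels are illustrative assumptions rather than our exact implementation.

```python
# Sketch of the hand-detection pipeline: OpenCV reads the camera, MediaPipe
# finds hand landmarks, and simple heuristics decide which gesture is shown.
import cv2
import math
import mediapipe as mp

mp_hands = mp.solutions.hands
FINGER_TIPS = [8, 12, 16, 20]  # index, middle, ring, pinky fingertip landmark ids

def fingers_up(hand):
    # Rough heuristic: a finger counts as "up" if its tip is above its middle joint.
    return [hand.landmark[tip].y < hand.landmark[tip - 2].y for tip in FINGER_TIPS]

def pinch_distance(hand):
    # Normalized distance between thumb tip (landmark 4) and index tip (landmark 8).
    thumb, index = hand.landmark[4], hand.landmark[8]
    return math.hypot(thumb.x - index.x, thumb.y - index.y)

cap = cv2.VideoCapture(0)  # access the camera of the host device
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB frames; OpenCV captures BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            hand = results.multi_hand_landmarks[0]
            if pinch_distance(hand) < 0.05:
                print("pinch -> could trigger a click")
            elif all(fingers_up(hand)):
                print("open palm -> could trigger scrolling or typing")
        cv2.imshow("GAT camera", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```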

Challenges we ran into

The first challenge our team ran into was learning how to work with our AI libraries. As a team we were all inexperienced in the field of AI, so for the first few days of the project we did our research to find out what we were capable of. This is when we found TensorFlow.js and MediaPipe, two libraries that would allow us to track the user's hands and give us information about them. The second challenge was managing our time: since our team knew very little about AI going into this project, we had to spend our first few days researching to get a full understanding of our libraries. We persevered by sticking to our goals and communicating clearly as a team.

Accomplishments that we're proud of

One of the accomplishments our team is most proud of is learning how to get TensorFlow.js and MediaPipe working with no prior knowledge, and then using these AI libraries to enhance the user experience.

We are also proud of the mock social media website and the gestural AI that adds functionality to it. We are very impressed with the progress we made and the functionality we were able to add using hand gestures.

What we learned

The development of GAT was an immersive learning process. Before this project, we had never worked with artificial intelligence libraries. We learned the basics of machine learning and artificial intelligence, how to use the TensorFlow.js library to track human body parts and interact with HTML pages, and, while developing our computer-optimized model, how to use the MediaPipe and OpenCV Python modules to track human hands and train an AI to recognize hand gestures. Building the GAT model together also taught us how to work as a team: we faced many challenges and learned how to help each other overcome them, and how to set deadlines and stick to them. Working on the GAT model was an amazing time, and every member of the team grew significantly in knowledge and character.

What's next for GAT

The future of GAT is filled with endless possibilities. We plan to further develop the GAT model to recognize American Sign Language signs and integrate it into an ASL translator and learning app. This will ease communication for people with disabilities and teach others how to communicate using this medium. In addition, we plan to create a version of GAT that lets users add their own gestures and map them to operations. Lastly, we plan to create a web-extension version of the GAT model, allowing users to navigate their web browsers with hand gestures. We are certain that the GAT model will evolve into something beautiful. Infinity is the limit for the future of GAT.

Built With

html, javascript, mediapipe, opencv, python, tensorflow.js