This repo won second place at the McGill AI Hackathon 2025.
This project combines our love for music with our interest in computer vision, so we took the best of both worlds and built a project around Spotify.
Use hand gestures on camera to control your Spotify.
Built using the Visual Studio Code IDE.
The visualizer was hard to build, merging all of our files was difficult, and finding datasets to keep our model accurate took a long time. We deliberated over which model to use, comparing and contrasting each one. We attempted to train a model from scratch, but it took excessively long, so we decided to use an existing model and modify the inputs. The challenge with this existing model was that it supported only a limited number of gestures, so we gathered more data and trained it further until it could recognize all the gestures we wanted the app to support.
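For reference, here is a minimal sketch of running a pretrained gesture classifier on a single webcam frame. The file name gesture_model.pt, the input size, and the GESTURE_LABELS list are hypothetical placeholders for illustration, not our actual assets.

```python
# Sketch: classify one webcam frame with a pretrained gesture model.
# "gesture_model.pt" and GESTURE_LABELS are hypothetical placeholders.
import cv2
import numpy as np
import torch

GESTURE_LABELS = ["play", "pause", "next", "previous", "volume_up", "volume_down"]

# Assumes the full model object was serialized; weights_only=False allows unpickling it.
model = torch.load("gesture_model.pt", map_location="cpu", weights_only=False)
model.eval()

def classify_frame(frame_bgr):
    # Convert BGR -> RGB, resize, and scale to [0, 1] to match an assumed input format.
    img = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (224, 224)).astype(np.float32) / 255.0
    tensor = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)  # shape (1, 3, 224, 224)
    with torch.no_grad():
        logits = model(tensor)
    return GESTURE_LABELS[int(logits.argmax(dim=1))]
```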
Getting the visualizer done, designing the UI, getting the camera to work, and calling Spotify functions. We seamlessly integrated the gesture-recognition AI with the Spotify API to get the desired result and make the best experience for our future users.
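As a rough illustration of the camera-to-Spotify loop, here is a sketch that reads webcam frames with OpenCV and maps recognized gestures to playback calls. It assumes the spotipy client library and placeholder credentials; classify_frame is the hypothetical classifier from the sketch above, and our actual code may call the Spotify Web API differently.

```python
# Sketch: webcam loop that turns recognized gestures into Spotify playback calls.
# spotipy usage and credentials are assumptions, not our exact implementation.
import cv2
import spotipy
from spotipy.oauth2 import SpotifyOAuth

sp = spotipy.Spotify(auth_manager=SpotifyOAuth(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    redirect_uri="http://localhost:8888/callback",
    scope="user-modify-playback-state",
))

# Map gesture labels to Spotify playback actions.
ACTIONS = {
    "play": sp.start_playback,
    "pause": sp.pause_playback,
    "next": sp.next_track,
    "previous": sp.previous_track,
}

cap = cv2.VideoCapture(0)  # open the default webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gesture = classify_frame(frame)  # hypothetical classifier from the sketch above
    if gesture in ACTIONS:
        ACTIONS[gesture]()
    cv2.imshow("Gesture control", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```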
Spotify API, Python Tkinter library, OpenCV, pretrained CNN gesture models, PyTorch, Windows SDK, NumPy, PyAudio. We also learned to interact with hardware by accessing the computer's sound card and webcam.
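For the audio side, here is a minimal sketch of grabbing sound-card input with PyAudio and computing an FFT with NumPy, the kind of per-frequency data a bar-style visualizer can draw. The chunk size, sample rate, and device choice are assumptions rather than our exact configuration.

```python
# Sketch: capture audio with PyAudio and compute FFT magnitudes for a visualizer.
# Parameters are assumptions, not our exact configuration.
import numpy as np
import pyaudio

CHUNK = 1024   # samples per read
RATE = 44100   # sample rate in Hz

pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                 input=True, frames_per_buffer=CHUNK)

try:
    while True:
        raw = stream.read(CHUNK, exception_on_overflow=False)
        samples = np.frombuffer(raw, dtype=np.int16)
        spectrum = np.abs(np.fft.rfft(samples))  # magnitude per frequency bin
        bars = spectrum[:64]                     # e.g. feed these to a Tkinter canvas
except KeyboardInterrupt:
    pass
finally:
    stream.stop_stream()
    stream.close()
    pa.terminate()
```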
Collecting our own data, doing the training ourselves from start to finish, and delivering the app as a website or installer.
Demo: https://drive.google.com/file/d/1Tk_j2XhFdSQyq7u8ZkqVTY8g6KN0smni/view Thank you to dynamic_gestures for the CNN models: https://github.com/ai-forever/dynamic_gestures/tree/main/models
Thank you all for collaborating on the project and we're all very excited to present it!