Inspiration
All four of us play instruments, and we wanted to merge our passion for programming and computer science with music. Our original idea was to do automatic song transcription to guitar tab, but we decided it would be more fun to make String Protagonist.
What it does
In String Protagonist, you can pick the song that you want to play, and all you need is a microphone and a quiet room. Notes will fly down towards you, and you simply have to hit the right notes at the right times. It's like practicing with a band without needing an actual band playing with you!
How we built it
We broke the project up into two major components that could be tackled in parallel: the game and the pitch detection.
The game was coded in JS with React.js and MUI for UI components. It uses tabs scraped from the website songsterr.com, an online repository of guitar tabs which maps the notes in the tabs to their timestamps in the song. The main difficulties here were processing this data and building an engaging user interface.
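As a rough illustration of the data-processing step, here is a minimal sketch in Rust (the game itself was written in JS; Rust is used here only so that all sketches in this write-up share one language). The `TabNote` struct, its fields, and the `schedule` function are hypothetical and do not reflect the actual Songsterr data format; the only facts relied on are the MIDI numbers of standard guitar tuning (E2 A2 D3 G3 B3 E4 = 40, 45, 50, 55, 59, 64).

```rust
/// Hypothetical tab note: a timestamp plus a string/fret position.
/// (Assumed shape for illustration; not the real Songsterr format.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct TabNote {
    time_ms: u32,
    string: usize, // 0 = low E string, 5 = high E string
    fret: u8,
}

/// MIDI note numbers of the open strings in standard tuning (E2..E4).
const STANDARD_TUNING: [u8; 6] = [40, 45, 50, 55, 59, 64];

/// MIDI note for a tab position, or None if the string index is invalid
/// or the addition would overflow.
fn midi_of(note: TabNote) -> Option<u8> {
    STANDARD_TUNING
        .get(note.string)
        .and_then(|&open| open.checked_add(note.fret))
}

/// Sort notes by time and turn them into (timestamp, pitch class) pairs,
/// where pitch class 0 = C, 1 = C#, ..., 11 = B. Invalid notes are dropped.
fn schedule(mut notes: Vec<TabNote>) -> Vec<(u32, u8)> {
    notes.sort_by_key(|n| n.time_ms);
    notes
        .iter()
        .filter_map(|n| midi_of(*n).map(|m| (n.time_ms, m % 12)))
        .collect()
}
```

For example, the open low E string maps to MIDI 40, which is pitch class 4 (E), and the third fret of the high E string maps to MIDI 67, which is pitch class 7 (G).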
For the pitch detection, we knew that it would need to carry out complex computations in real time, so we chose to write it in Rust and run it in WebAssembly. Our first thought was to detect fundamental frequencies by taking the loudest frequencies in the FFT of the waveform, but we quickly realized that this wouldn't be sufficient, because the loudest frequency is not always one that is being played intentionally. For example, when playing the lowest E string on my guitar, the loudest frequency is an overtone, which sounds like a B.
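The overtone problem can be reproduced with a small sketch. This is not our actual detection code: it uses a naive DFT instead of a real FFT library, and the signal is synthetic, with an assumed weak fundamental at 82.41 Hz (low E) and an assumed stronger third harmonic. All function names here are made up for illustration.

```rust
use std::f64::consts::PI;

/// Magnitude of the DFT of `samples` at bin `k` (naive, O(n) per bin).
fn dft_magnitude(samples: &[f64], k: usize) -> f64 {
    let n = samples.len() as f64;
    let (mut re, mut im) = (0.0, 0.0);
    for (i, &x) in samples.iter().enumerate() {
        let angle = 2.0 * PI * (k as f64) * (i as f64) / n;
        re += x * angle.cos();
        im -= x * angle.sin();
    }
    (re * re + im * im).sqrt()
}

/// Frequency in Hz of the loudest bin below Nyquist, skipping the DC bin.
fn loudest_frequency(samples: &[f64], sample_rate: f64) -> f64 {
    let n = samples.len();
    let mut best_bin = 1;
    let mut best_mag = 0.0;
    for k in 1..n / 2 {
        let mag = dft_magnitude(samples, k);
        if mag > best_mag {
            best_mag = mag;
            best_bin = k;
        }
    }
    best_bin as f64 * sample_rate / n as f64
}

/// Synthetic "low E string": weak fundamental, strong third harmonic.
/// The amplitudes 0.3 and 1.0 are assumptions chosen to show the effect.
fn synthetic_low_e(sample_rate: f64, n: usize) -> Vec<f64> {
    let f0 = 82.41;
    (0..n)
        .map(|i| {
            let t = i as f64 / sample_rate;
            0.3 * (2.0 * PI * f0 * t).sin() + (2.0 * PI * 3.0 * f0 * t).sin()
        })
        .collect()
}
```

Calling `loudest_frequency(&synthetic_low_e(4000.0, 2000), 4000.0)` picks a bin near 247 Hz, the third harmonic of the low E, which is very close to B3 (about 246.94 Hz) rather than the E that was actually played.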
To resolve this, we decided to use a neural network to extract the pitches from the frequencies. We found a dataset of transcribed guitar recordings called GuitarSet and trained an MLP on the results of the FFT. We adjusted both the architecture and the hyperparameters of the MLP to get the training loss to converge as low as possible while keeping good performance on the test set. Some of the changes that made a noticeable difference were switching the activation from ReLU to LeakyReLU to help with vanishing gradients, increasing both the depth and the width of the network, and adding learning rate schedulers so that training could properly settle into loss minima. For each sample, we also compressed the note labels into a single octave: for the purposes of a guitar game like String Protagonist, it is unlikely that a player would play a note in the wrong octave, so all we really care about is the note name being played. All in all, our best model had an average test loss of around 0.15. Reading that loss as an average negative log-likelihood, this means that each test note had roughly an e^-0.15 chance of being correct, which is around 86%!
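A few of the small pieces mentioned above can be sketched in Rust. These are illustrative helpers rather than our training code: the function names are made up, the 0.01-style slope for LeakyReLU is left as a parameter, and the loss conversion assumes the loss is an average negative log-likelihood using the natural logarithm.

```rust
/// Note names for pitch classes 0..=11.
const NOTE_NAMES: [&str; 12] = [
    "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B",
];

/// Nearest MIDI note number for a frequency in Hz (A4 = 440 Hz = MIDI 69).
fn freq_to_midi(freq_hz: f64) -> i32 {
    (69.0 + 12.0 * (freq_hz / 440.0).log2()).round() as i32
}

/// Fold a MIDI note into one octave: 0 = C, 1 = C#, ..., 11 = B.
fn pitch_class(midi: i32) -> usize {
    midi.rem_euclid(12) as usize
}

/// LeakyReLU: identity for non-negative inputs, a small slope otherwise,
/// so negative inputs still pass a nonzero gradient.
fn leaky_relu(x: f64, slope: f64) -> f64 {
    if x >= 0.0 { x } else { slope * x }
}

/// Probability implied by an average negative log-likelihood.
/// (Strictly, this is the geometric mean of the per-note probabilities.)
fn prob_from_mean_nll(loss: f64) -> f64 {
    (-loss).exp()
}
```

With these, the low E (82.41 Hz) and the high E (329.63 Hz) both fold to `NOTE_NAMES[4]`, which is "E", and `prob_from_mean_nll(0.15)` evaluates to about 0.86.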
Challenges we ran into
Shiyan: The main challenge that I ran into was building and training the neural network. I had to think about what changes would positively affect the accuracy of the classification, and efficiently implement them in our model.
Alex: Detecting multiple pitches from a waveform is harder than we expected it to be. In the past I've tried to avoid solving problems with AI (hence our team name) because I like classical algorithms. But after reading a couple of papers we realized that it was by far our best option.
Daniel: Accurately parsing the data we get from songsterr.com and transforming it into something we can map to our UI and background track was the crux of the problem. After that, I had to create a game loop that produces smooth animations while staying accurate about when each note should be played. Since the JS runtime's timings can vary, we need to be able to adjust for that timing drift.
Louis: I had to quickly learn how to use the React library, which I had never heard of before, and adapt to it. It took a lot of debugging, internet research, and asking for help, but every time, with enough perseverance, I found my way through.
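One common way to cope with the variable frame timings Daniel mentions is to derive every note's position from the song clock instead of accumulating per-frame deltas, and to judge hits within a tolerance window. The sketch below shows that idea in Rust for consistency with the other sketches; it is not the game's actual JS code, and every name and parameter (`note_y`, `judge`, `lead_time_ms`, `window_ms`) is hypothetical.

```rust
/// Outcome of comparing a detected note against a scheduled one.
#[derive(Debug, PartialEq)]
enum Judgement {
    Hit,
    Miss,
}

/// Vertical position of a note, computed from the song clock.
/// The note is at y = 0 when it is `lead_time_ms` away and reaches
/// `lane_height_px` exactly when it should be played, no matter how
/// unevenly the frames were rendered in between.
fn note_y(note_time_ms: f64, song_time_ms: f64, lead_time_ms: f64, lane_height_px: f64) -> f64 {
    let remaining = note_time_ms - song_time_ms;
    lane_height_px * (1.0 - remaining / lead_time_ms)
}

/// A played note counts as a hit if it has the right pitch class and
/// lands within `window_ms` of the scheduled time.
fn judge(
    note_time_ms: f64,
    note_pitch_class: u8,
    played_time_ms: f64,
    played_pitch_class: u8,
    window_ms: f64,
) -> Judgement {
    let on_time = (played_time_ms - note_time_ms).abs() <= window_ms;
    if on_time && played_pitch_class == note_pitch_class {
        Judgement::Hit
    } else {
        Judgement::Miss
    }
}
```

Because `note_y` depends only on the current song time, a dropped or late frame moves the note further in the next frame rather than letting it fall behind the music.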
Accomplishments that we're proud of
Creating a functional game and user interface. Most of us also did not believe that we'd be able to get a fully working ML model in two days' time.
What we learned
How to develop under time pressure, as well as how to divide work based on each of our strengths and prior experience.
What's next for String Protagonist
Improve the neural network. Right now there are some latency issues that are inherent to the processing the network must perform before a note can be recognized, so reducing that latency would be the main goal going forward.
