Inspiration
These days, there are a million tools to help you code that slowly rot your brain away. But what about unleashing your creative side? Sometimes you get that stroke of inspiration and suddenly know how to make a banger song. But wait, you don't, because harmony is a four-semester class and ain't nobody got time for that. The solution? HarmonAI, a tool that automatically harmonizes your melody to your liking, even providing the MIDI files so you can drag them around in Ableton before your free trial runs out. The path to becoming the next Taylor Swift (or the equivalent in your favorite genre) is finally here!
What it does
With HarmonAI, you can sing or hum a melody and provide a text prompt describing how you want it harmonized. Then, with the click of a button, it generates an entire song, complete with a complex chord progression, block form with a chorus, multiple instruments, and a drum pattern. Best of all, it gives you the MIDI for all of these, so you can adjust them however you want later on.
How I built it / Challenges
First, the input audio is converted to MIDI with Spotify's Open Mic. Then comes the hard part: harmonizing it. I built the harmonization process on AccoMontage2, an open-source tool that generates a chord progression for a single-voice melody, perfect for our use case. However, it has poor support for inputs of arbitrary length: you must specify the exact length in phrases, each phrase has to be a multiple of 4 measures, and the number of 16th notes has to be exactly right. To satisfy these requirements, I extend or truncate the melody to fit, using external cues to decide the phrasing exactly (see the sketch below). With that, I was able to generate a chord progression from the melody (several different progressions, depending on the melody).

However, AccoMontage2 was made to generate a piano accompaniment, which isn't suitable for a fully fledged song with multiple voices on different instruments. To work around this, I had to modify how the program generates textures. By default, texture generation doesn't even run, since the repository is missing binaries that are also absent from the linked drive folder, so I had to make do with a version from another fork of the project (specifically an edge-weights file) and modify the code to accept it. Then, by generating multiple textures from the same chord progression, each with different preset parameters, I could produce voices for different instruments. This was all fine and dandy, and I also reworked the output logic to write a single-track MIDI file per instrument. Finally, the drums are generated with Magenta's Drumify. So in the end, we are left with MIDI tracks for the adjusted melody, the accompaniments, and the drums.
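To illustrate the length-fitting step, here's a minimal sketch (not the actual project code; the helper name and the fixed-tempo, 4/4 assumptions are mine) that quantizes a monophonic melody to a 16th-note grid and pads or truncates it to a whole number of 4-measure phrases, using the pretty_midi library:

```python
import pretty_midi

SIXTEENTHS_PER_MEASURE = 16   # assuming 4/4 time
MEASURES_PER_PHRASE = 4       # AccoMontage2 phrases span multiples of 4 measures
STEPS_PER_PHRASE = SIXTEENTHS_PER_MEASURE * MEASURES_PER_PHRASE


def fit_melody_to_phrases(midi_path, bpm=120.0):
    """Quantize a monophonic melody to a 16th-note grid, then pad or
    truncate it so its length is a whole number of 4-measure phrases."""
    pm = pretty_midi.PrettyMIDI(midi_path)
    step = 60.0 / bpm / 4  # duration of one 16th note in seconds
    notes = pm.instruments[0].notes

    # Snap every note's boundaries onto the 16th-note grid.
    for note in notes:
        note.start = round(note.start / step) * step
        note.end = max(note.start + step, round(note.end / step) * step)

    # Round the total length to the nearest whole phrase (at least one).
    last_step = round(max(n.end for n in notes) / step)
    target = max(STEPS_PER_PHRASE,
                 round(last_step / STEPS_PER_PHRASE) * STEPS_PER_PHRASE)
    boundary = target * step

    if target >= last_step:
        # Extend: stretch the latest-ending note to fill out the last phrase.
        final = max(notes, key=lambda n: n.end)
        final.end = boundary
    else:
        # Truncate: drop notes past the boundary and clip the rest.
        pm.instruments[0].notes = [n for n in notes if n.start < boundary]
        for n in pm.instruments[0].notes:
            n.end = min(n.end, boundary)
    return pm
```

The real pipeline also has to choose where the phrase boundaries fall, which this sketch sidesteps by only fixing the total length.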
To put everything together, I built a frontend website that helps users record audio and also takes in a text prompt. The text prompt is fed into Gemini to determine the exact input parameters for the backend model (style, phrasing, etc.); I made a very simple Gemini wrapper API to handle this, sketched below. With those parameters, the backend model from above is called and returns links to the relevant MIDI files, which are displayed on the website (inlined, so they can be played without being downloaded). The user can choose to play the files together, play them individually, or download them. And that's about it!
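As a rough sketch of what that wrapper could look like, here's a single Flask route using the google-generativeai package; the endpoint name, parameter schema, and model name are my assumptions, not the project's actual API:

```python
import json

from flask import Flask, jsonify, request
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")            # assumption: key supplied via config
model = genai.GenerativeModel("gemini-1.5-flash")  # model choice is an assumption

app = Flask(__name__)


@app.route("/parse-prompt", methods=["POST"])  # endpoint name is hypothetical
def parse_prompt():
    """Ask Gemini to turn a free-text style prompt into backend parameters."""
    user_prompt = request.json["prompt"]
    instruction = (
        "Extract harmonization parameters from this request as a JSON object "
        'with keys "style", "phrase_lengths", and "instruments": ' + user_prompt
    )
    response = model.generate_content(instruction)
    # Assumes the model returns bare JSON; real code should validate the output
    # and strip any markdown fences before parsing.
    return jsonify(json.loads(response.text))
```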
Accomplishments that we're proud of
I am proud of the fact that the thing actually works pretty well: it can generate full-length songs with fairly complex chord progressions. Overall, everything also sounds good and coherent, except maybe the drums at times. So I think I achieved my end goal here pretty well.
What's next for HarmonAI
Next, I will deploy the website, everyone will start using it, I'll charge a million dollars per person, and I'll get hella rich.
Built With
- flask
- javascript
- midi.js
- python
- react
- tensorflow