Beatlearner is a Beat Saber map generator conditioned on natural language. Given an mp3 file and a verbal description of what you want in the map, Beatlearner will provide a map that is designed to fit your needs. Our needs, that is.
This was easily one of the most ambitious projects that we have ever attempted at a hackathon. We needed to handle not just one massive pretrained model but three (GPT-3, BERT, and Jukebox), in addition to training our own. It was well worth it in the end. All of us thoroughly enjoy both Beat Saber and ML, and combining the two in this way really works.
On the language end, we used OpenAI's GPT-3 to generate prompts from the very limited information, tags, and descriptions that we have available to us. This returned some great natural-language results, which we fed into BERT to get embeddings.
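A minimal sketch of one common way to turn per-token BERT outputs into a single description embedding: masked mean pooling. The pooling strategy is an assumption (the write-up does not say how the embeddings were reduced), and `mean_pool` is a hypothetical helper written in plain Python so it runs without any model download.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is truthy.

    token_vectors: list of equal-length lists of floats (one per token).
    attention_mask: list of 0/1 flags; 0 marks padding to be ignored.
    Assumption: this stands in for pooling real BERT hidden states.
    """
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for k in range(dim):
                total[k] += vec[k]
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]


# Toy example: two real tokens and one padding token.
embedding = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
print(embedding)
```

The padding token is excluded from the average, so only the first two vectors contribute to the result.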
On the audio end, we used an audio encoder from OpenAI's Jukebox.
We used an autoregressive transformer with a custom-built ConditionedSparseAttention mechanism to generate beat blocks, taking into account both the music at the current moment and the BERT embeddings.
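To illustrate the general idea of sparse, conditioned, autoregressive attention, here is a rough sketch of an attention mask in plain Python. It is not the ConditionedSparseAttention implementation: the local causal window, the always-visible conditioning slots, and the names `conditioned_sparse_mask`, `window`, and `num_cond` are all assumptions made for illustration.

```python
def conditioned_sparse_mask(seq_len, window, num_cond):
    """Build a boolean attention mask with seq_len rows.

    Each row has num_cond + seq_len columns. Columns 0..num_cond-1 are
    conditioning slots (e.g. a text embedding); the remaining columns
    are block positions. mask[i][j] is True when query position i may
    attend to key j.
    Assumption: a fixed local window is used as the sparsity pattern.
    """
    mask = []
    for i in range(seq_len):
        # Conditioning slots are visible from every position.
        row = [True] * num_cond
        for j in range(seq_len):
            # Causal (j <= i) and local (at most `window` recent positions).
            row.append(i - window < j <= i)
        mask.append(row)
    return mask


# Print the pattern: "x" marks an allowed query/key pair.
for row in conditioned_sparse_mask(seq_len=5, window=2, num_cond=1):
    print("".join("x" if allowed else "." for allowed in row))
```

With this pattern each position sees the conditioning signal plus only its most recent neighbours, so the number of allowed pairs grows linearly with sequence length instead of quadratically.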
There is a lot left to do! The model still needs a lot of work, and we never had the compute resources or the time to finish training it.