Inspiration

Blind karaoke is a trend that has popped up within the last few years: a random song is put on for karaoke, the singer must face away from the screen, and the audience watches as the singer struggles and improvises a new set of original lyrics. But why hasn't this been officially turned into a game? What if you could play without having to pick the song yourself, get a score, and remember exactly what you sang without having to record a video? Our project does exactly that. The goal of the app is to boost confidence in creative singing, create a fun community anywhere, and build mental resilience through our favourite songs and playlists.

What it does

Blind Karaoke is a web application that keeps track of the lyrics you get wrong while playing: a speech-to-text function transcribes your singing, and the result is compared against the song's original lyrics. The game is suitable for anyone, from casual players to professionals who may want to use the app as a fun, multiplayer way to internalize lyrics.

How we built it

Frontend: We utilized Figma to prototype our app design, coordinating UX and UI workflows. Select design elements were created using Procreate.

Backend: We began by laying out the basic structure of our code: a transcriber to convert speech to text, a database to pull karaoke songs from, a comparison module to evaluate the transcribed text against the actual lyrics, and finally a main file to run our project's backend.
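Concretely, that breaks the backend into a handful of files. Only database.py is named explicitly later in this write-up, so the rest of this layout is illustrative:

```
blind-karaoke/
├── transcriber.py   # Whisper speech-to-text
├── database.py      # JSON song library: search, add, filter
├── comparison.py    # transcription vs. lyrics, grading
├── main.py          # game session entry point
└── songs.json       # song metadata and lyrics
```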

For our transcriber, we used OpenAI's Whisper model to convert audio into plain strings. We then defined a JSON database with various songs, including genres, artists, and other relevant details. We created a comparison module that compared our lyrics (stored in the database) with the user's transcribed voice, outputting an evaluation. Finally, in the main file, we pulled all of these modules together to create a "game instance" that tracked more advanced details, including accuracy, errors, and more.

Transcriber: We decided to use OpenAI's Whisper model to pick up audio and convert it into string format. We added options to let users stop recording early, pause, and exit whenever they wished. We used several while loops and if/else statements in the process, setting flags whenever the user made a choice.
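As a minimal sketch of the transcription step, assuming the open-source openai-whisper package (the model size and audio file name here are placeholders):

```python
import whisper

def transcribe(audio_path: str) -> str:
    """Convert a recorded karaoke take into plain text."""
    model = whisper.load_model("base")     # small model; larger ones are slower but more accurate
    result = model.transcribe(audio_path)  # returns a dict with a "text" field
    return result["text"].strip()

if __name__ == "__main__":
    print(transcribe("take.wav"))  # "take.wav" is a hypothetical recording
```

The pause/stop controls around the recording loop are omitted here for brevity.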

Database: We created a JSON dataset with song information, detailing genres, artists, titles, and release years. Then, we linked this dataset to our database.py file, where we wrote several functions allowing us to search for songs, add songs, and filter songs by the aforementioned details. We also wrote a search algorithm that finds the most relevant song for the user's query.
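A sketch of what database.py could look like; the field names (title, artist, genre, year) are assumptions about our schema:

```python
import json

def load_songs(path: str = "songs.json") -> list[dict]:
    """Load the song library from the JSON dataset."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def filter_songs(songs: list[dict], **criteria) -> list[dict]:
    """Filter by any combination of fields, e.g. genre="pop", year=1999."""
    return [s for s in songs if all(s.get(k) == v for k, v in criteria.items())]

def search_songs(songs: list[dict], query: str) -> list[dict]:
    """Rank songs by how many query words appear in the title or artist."""
    words = query.lower().split()

    def score(song: dict) -> int:
        haystack = f"{song['title']} {song['artist']}".lower()
        return sum(w in haystack for w in words)

    return sorted((s for s in songs if score(s) > 0), key=score, reverse=True)
```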

Comparison: We began by converting the transcribed text to a format consistent with the lyrics in our dataset (lowercase with appropriate spacing). Then, we walked through both strings and measured how closely the transcription matched the actual lyrics. Based on how many characters were identical, we assigned letter grades and performance values to display to the user once they ended their session.
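One way to implement this with only the standard library; the grade cutoffs below are illustrative, not our exact thresholds:

```python
import string
from difflib import SequenceMatcher

def normalize(text: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())

def grade(transcribed: str, lyrics: str) -> tuple[float, str]:
    """Map a character-level similarity ratio to a letter grade."""
    ratio = SequenceMatcher(None, normalize(transcribed), normalize(lyrics)).ratio()
    for cutoff, letter in [(0.9, "A"), (0.8, "B"), (0.7, "C"), (0.6, "D")]:
        if ratio >= cutoff:
            return ratio, letter
    return ratio, "F"

accuracy, letter = grade("twinkle twinkle little car", "Twinkle, twinkle, little star!")
print(f"{accuracy:.0%} -> {letter}")
```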

Main: The main file ties all of the other modules together and creates a "game session", initializing the user and letting them choose among several options: song speed, song selection, competition, etc.
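Roughly how that wiring might look, reusing the sketches above (the recording path and the lyrics field are assumptions):

```python
from transcriber import transcribe
from database import load_songs, search_songs
from comparison import grade

def game_session() -> None:
    songs = load_songs()
    query = input("Pick a song: ")
    song = search_songs(songs, query)[0]  # best match from the search algorithm
    print(f"Now playing: {song['title']} by {song['artist']} (no peeking!)")
    sung = transcribe("take.wav")         # hypothetical recording of the attempt
    accuracy, letter = grade(sung, song["lyrics"])
    print(f"Accuracy: {accuracy:.0%}  Grade: {letter}")

if __name__ == "__main__":
    game_session()
```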

Challenges we ran into

We had difficulties when it came to integrating the frontend and backend of our project. Our original frontend was prototyped in Figma, but due to issues with full-stack integration, we turned to a simpler frontend design that was more compatible with our Python scripts.

Accomplishments that we're proud of

We're proud that we were able to integrate the Spotify Web API into our project, pulling songs from Spotify's library through user input and our search algorithm. We also built a full-stack application with a working backend and a stable frontend, plus a fully prototyped design in Figma.
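In Python, one common way to call the Spotify Web API is the spotipy client; which library we used isn't specified above, so treat this as an assumption. Credentials are read from the SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET environment variables:

```python
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())

def find_track(query: str) -> dict:
    """Return the top Spotify search hit for a free-text query."""
    results = sp.search(q=query, type="track", limit=1)
    return results["tracks"]["items"][0]

track = find_track("never gonna give you up")
print(track["name"], "-", track["artists"][0]["name"])
```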

What we learned

We learned that it would be wise to consider frontend/backend integration from the start, rather than tackling the two separately and facing difficulties later in the project. Settling integration first would have let us split up afterwards and work at our own pace. We also explored various ways of using speech-to-text models, and gained proficiency in working with full-stack processes.

What's next for Blind Karaoke

In the future, we would love to expand this project from a web application into a mobile app, so users can have fun on the go. We'd also like to develop personalization for users, including favorite songs, all-time stats, and more custom features.

Built With

Figma, Procreate, Python, OpenAI Whisper, Spotify Web API, JSON
