Inspiration
Many people with medical conditions such as mutism or ALS are unable to speak with a natural human voice. Machine-synthesized speech is a potential solution, but it often sounds unnatural and robotic, with strange intonation.
This presents an opportunity for a technology that would let such patients engage freely in verbal communication.
What it does
It processes the waveform of an input audio signal and manipulates it to sound like the target person's voice.
How I built it
I processed the audio signals into features and trained a deep neural network to map the input speaker's signal to the target speaker's signal.
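The write-up doesn't include code, so here is a minimal sketch of the general approach, under assumptions of mine: short-time spectral magnitudes as features and a one-hidden-layer network (the project's actual network was deeper) trained to map source-speaker frames to target-speaker frames.

```python
import numpy as np

def stft_magnitudes(signal, frame_len=256, hop=128):
    """Slice a 1-D signal into overlapping windowed frames, return |FFT| features."""
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)) / frame_len

class FrameMapper:
    """One-hidden-layer network: source spectral frame -> target spectral frame."""

    def __init__(self, dim, hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, dim))
        self.b2 = np.zeros(dim)

    def forward(self, X):
        self.h = np.tanh(X @ self.W1 + self.b1)   # cache activations for backprop
        return self.h @ self.W2 + self.b2

    def train_step(self, X, Y, lr=0.05):
        """One gradient-descent step on mean squared error; returns the loss."""
        pred = self.forward(X)
        err = pred - Y                            # gradient of 0.5 * MSE w.r.t. pred
        n = len(X)
        dW2 = self.h.T @ err / n
        db2 = err.mean(axis=0)
        dh = (err @ self.W2.T) * (1.0 - self.h ** 2)   # tanh' = 1 - tanh^2
        dW1 = X.T @ dh / n
        db1 = dh.mean(axis=0)
        self.W1 -= lr * dW1; self.b1 -= lr * db1
        self.W2 -= lr * dW2; self.b2 -= lr * db2
        return float((err ** 2).mean())
```

On parallel recordings (the same utterance by both speakers), each training pair is one aligned frame of source and target spectra; the loss should fall as the mapper learns the transformation.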
Challenges I ran into
Modifying speech signals is an incredibly challenging task. The selected features had to contain enough information to regenerate an audio signal from them, which left the machine learning model operating in a very high-dimensional space.
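The invertibility constraint is the crux: magnitude-only features discard phase and cannot be turned back into audio directly. As an illustrative sketch (my own example, not the project's code), complex STFT frames do satisfy the constraint, since weighted overlap-add reconstructs the waveform exactly:

```python
import numpy as np

def frames_to_features(signal, frame_len=256, hop=128):
    """Complex STFT frames: invertible features, unlike magnitudes alone."""
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array([np.fft.rfft(signal[i:i + frame_len] * window) for i in starts])

def features_to_signal(frames, frame_len=256, hop=128):
    """Weighted overlap-add resynthesis from complex STFT frames."""
    window = np.hanning(frame_len)
    n = hop * (len(frames) - 1) + frame_len
    out = np.zeros(n)
    norm = np.zeros(n)
    for k, frame in enumerate(frames):
        chunk = np.fft.irfft(frame, n=frame_len)   # windowed time-domain segment
        out[k * hop:k * hop + frame_len] += chunk * window
        norm[k * hop:k * hop + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)            # normalize by window overlap
```

A frequency-domain feature set like this is invertible but high-dimensional (129 complex bins per 256-sample frame here), which is exactly the tension described above.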
Accomplishments that I'm proud of
This is a proof of concept that unveils what could be possible in the future in fields such as speech synthesis and text-to-speech.
What's next for Voice Mapping
Experiment with adversarial nets (a combination of a generative net and a discriminative net) and gather more training data.
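To make the adversarial direction concrete, here is a toy sketch of the training loop, under my own simplifying assumptions: a linear generator producing fake "voice feature" vectors from noise, and a logistic discriminator scoring real versus generated samples, each updated against the other.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

class TinyGAN:
    """Linear generator vs. logistic discriminator, trained adversarially."""

    def __init__(self, noise_dim, data_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.Wg = rng.normal(0.0, 0.1, (noise_dim, data_dim))  # generator weights
        self.wd = rng.normal(0.0, 0.1, data_dim)               # discriminator weights
        self.bd = 0.0

    def generate(self, z):
        return z @ self.Wg

    def discriminate(self, x):
        return sigmoid(x @ self.wd + self.bd)   # probability that x is real

    def step(self, real, z, lr=0.05):
        """One adversarial round; returns mean D(real) and D(fake)."""
        n = len(real)
        fake = self.generate(z)
        # Discriminator update: push D(real) -> 1 and D(fake) -> 0.
        d_real, d_fake = self.discriminate(real), self.discriminate(fake)
        ds_real, ds_fake = d_real - 1.0, d_fake     # BCE gradients w.r.t. logits
        self.wd -= lr * (real.T @ ds_real + fake.T @ ds_fake) / n
        self.bd -= lr * (ds_real.sum() + ds_fake.sum()) / n
        # Generator update: push D(fake) -> 1 through the updated discriminator.
        d_fake = self.discriminate(self.generate(z))
        dg = (-(1.0 - d_fake))[:, None] * self.wd   # d(-log D(fake)) / d(fake)
        self.Wg -= lr * (z.T @ dg) / n
        return float(d_real.mean()), float(d_fake.mean())
```

In the full system the generator would be the voice-mapping network itself and the discriminator would judge whether a generated frame sounds like the target speaker, but this skeleton shows the alternating-update structure.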