Inspiration

There are many people with medical conditions, such as muteness or ALS, that leave them unable to speak with a natural human voice. Machine-synthesized speech is a potential solution, but it often sounds unnatural and robotic, with strange intonation.

This presents an opportunity for a technology that would enable such patients to engage freely in verbal communication.

What it does

It processes and manipulates the waveform of the input audio signal so that it sounds like the target person's voice.

How I built it

I extracted features from the input audio signal and trained a deep neural network to map them to the corresponding features of the target speaker's voice.
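As a sketch of this mapping step: the snippet below extracts magnitude-spectrum features from framed audio and fits a frame-to-frame mapping between a source and a target signal. The synthetic tones and the linear least-squares map (standing in for the deep network) are assumptions for illustration, not the project's actual model.

```python
import numpy as np

def magnitude_frames(signal, win=256, hop=128):
    """Slice a signal into overlapping windowed frames; return magnitude spectra."""
    window = np.hanning(win)
    n = 1 + (len(signal) - win) // hop
    frames = np.stack([signal[i * hop : i * hop + win] * window for i in range(n)])
    return np.abs(np.fft.rfft(frames, axis=1))

# Toy "parallel corpus": the same utterance from source and target speakers.
# Two synthetic tones stand in for real recordings here (an assumption).
t = np.linspace(0, 1, 8000, endpoint=False)
source = np.sin(2 * np.pi * 220 * t)
target = np.sin(2 * np.pi * 330 * t)

X = magnitude_frames(source)   # (frames, bins) source features
Y = magnitude_frames(target)   # (frames, bins) target features

# The deep network is replaced by a single linear map fit with least
# squares, purely to illustrate the frame-to-frame mapping idea.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
Y_pred = X @ W

print("mean abs error:", np.mean(np.abs(Y_pred - Y)))
```

In a real system each 129-dimensional magnitude frame would pass through a nonlinear network rather than a single matrix, but the input/output shapes are the same.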

Challenges I ran into

Modifying speech signals is an incredibly challenging task. The selected features had to carry enough information to regenerate an audio signal from them, and the resulting feature space is high-dimensional, which makes the machine learning problem harder.
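To illustrate why the features must carry enough information to regenerate audio: magnitude spectra alone discard phase, so a reconstruction step such as Griffin-Lim iteration is needed to get a waveform back. The sketch below is a minimal NumPy version with an assumed Hann-windowed STFT; the window size, hop, and test tone are illustrative choices, not the project's actual settings.

```python
import numpy as np

WIN, HOP = 256, 64
WINDOW = np.hanning(WIN)

def stft(x):
    n = 1 + (len(x) - WIN) // HOP
    frames = np.stack([x[i * HOP : i * HOP + WIN] * WINDOW for i in range(n)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, length):
    frames = np.fft.irfft(spec, n=WIN, axis=1)
    x = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        x[i * HOP : i * HOP + WIN] += frame * WINDOW
        norm[i * HOP : i * HOP + WIN] += WINDOW ** 2
    return x / np.maximum(norm, 1e-8)

def griffin_lim(mag, length, iters=50):
    """Recover a waveform from magnitudes only, iteratively estimating phase."""
    rng = np.random.default_rng(0)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))
    for _ in range(iters):
        x = istft(mag * phase, length)
        phase = np.exp(1j * np.angle(stft(x)))
    return istft(mag * phase, length)

t = np.linspace(0, 1, 4000, endpoint=False)
original = np.sin(2 * np.pi * 200 * t)
mag = np.abs(stft(original))
rebuilt = griffin_lim(mag, len(original))

# How closely the rebuilt signal's spectrum matches the target magnitudes.
err = np.linalg.norm(np.abs(stft(rebuilt)) - mag) / np.linalg.norm(mag)
print("relative spectral error:", err)
```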

Accomplishments that I'm proud of

This is a proof of concept that unveils what could be possible in the future in the fields of speech synthesis and text-to-speech.

What's next for Voice Mapping

Experiment with adversarial nets, combining a generative network with a discriminative network, and gather more training data.
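A minimal sketch of the adversarial idea, assuming a toy one-dimensional setting: a generator maps noise to scalar "features" while a discriminator tries to tell them apart from real samples, and the two are trained in alternation. The scalar models, manual gradients, and hyperparameters below are illustrative assumptions, far simpler than the networks a real voice-mapping system would need.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy setup (an assumption for illustration): real "features" are scalars
# drawn from N(3, 0.5); the generator maps noise z ~ N(0, 1) to scalars.
a, c = 1.0, 0.0          # generator:      G(z) = a*z + c
w, b = 0.1, 0.0          # discriminator:  D(x) = sigmoid(w*x + b)
lr = 0.05

for step in range(2000):
    real = rng.normal(3.0, 0.5)
    z = rng.normal()
    fake = a * z + c

    # Discriminator ascends log D(real) + log(1 - D(fake)).
    d_real, d_fake = sigmoid(w * real + b), sigmoid(w * fake + b)
    w += lr * ((1 - d_real) * real - d_fake * fake)
    b += lr * ((1 - d_real) - d_fake)

    # Generator ascends log D(fake) (the non-saturating generator loss).
    d_fake = sigmoid(w * fake + b)
    g = (1 - d_fake) * w      # d log D(fake) / d fake
    a += lr * g * z
    c += lr * g

samples = a * rng.normal(size=1000) + c
print("generated mean:", samples.mean())
```

With real data, both players would be deep networks trained by backpropagation; the alternating update structure stays the same.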
