Inspiration
We were all interested in emotion analysis and felt it could be a really effective tool for anyone to use. This tool can help anyone who is trying to give a better presentation, whether a student, public speaker, or actor.
What it does
We built two separate but related tools that work together to help people make the most compelling presentation possible. For someone who is writing a script or document, we have the Live Sentiment Analysis tool: a web app where you can edit your work and home in on a targeted emotional impact. It uses the Watson NLP API to get document-level and sentence/clause-level analysis of the emotional content of the text, and it gives regular feedback on the overall and more specific emotional content of the document, as well as how your edits are changing it.

The second tool helps people master the audio portion of the presentation. You record an audio file and upload it; we use the Google Voice API to extract the text from the recording, then send that text to the Watson API to perform sentiment analysis on each clause of the presentation. We also analyze the audio from the MP3 file with the DeepAffects model, which recognizes the emotional content of speech without using any information about the words being spoken. Finally, we compare the clause-level emotional tags from the text and from the audio to see whether the speaker's voice really matches their words and captivates the audience.
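As a rough illustration of the text side of this pipeline, the per-clause analysis looks something like the sketch below. It is a minimal sketch using the ibm_watson Python SDK; the API key, service URL, and helper name are placeholders, not our exact code.

```python
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson.natural_language_understanding_v1 import Features, EmotionOptions, SentimentOptions

# Placeholder credentials: supplied via environment/config in the real app.
nlu = NaturalLanguageUnderstandingV1(
    version="2019-07-12",
    authenticator=IAMAuthenticator("YOUR_WATSON_API_KEY"),
)
nlu.set_service_url("YOUR_WATSON_SERVICE_URL")

def analyze_clause(clause):
    """Return Watson's emotion scores and sentiment label for one clause of text."""
    result = nlu.analyze(
        text=clause,
        features=Features(emotion=EmotionOptions(), sentiment=SentimentOptions()),
        language="en",
    ).get_result()
    return {
        "emotions": result["emotion"]["document"]["emotion"],   # joy, sadness, anger, fear, disgust
        "sentiment": result["sentiment"]["document"]["label"],  # positive / neutral / negative
    }
```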
How we built it
Challenges we ran into
We had some challenges integrating the APIs and connecting the frontend and backend. Another big challenge was making the product as effective as possible. For example, we spent a lot of time on the scoring function so that it would produce meaningful results: it combines data from the text analysis and the audio analysis, and we had to find a way of merging the two that reasonably represents the quality of someone's speech.
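The basic idea behind that combination can be sketched as follows. The emotion labels, weights, and matching rule here are illustrative assumptions, not our final scoring function.

```python
# Illustrative only: the labels and weights below are assumptions used to
# sketch the idea, not Hearo's exact scoring function.
EMOTIONS = ["joy", "sadness", "anger", "fear", "disgust"]

def clause_score(text_emotions, audio_emotions, match_weight=0.6, intensity_weight=0.4):
    """Score one clause by how well the vocal emotion matches the textual emotion.

    Both arguments are dicts mapping an emotion name to a confidence in [0, 1].
    """
    text_top = max(EMOTIONS, key=lambda e: text_emotions.get(e, 0.0))
    audio_top = max(EMOTIONS, key=lambda e: audio_emotions.get(e, 0.0))
    match = 1.0 if text_top == audio_top else 0.0
    # Reward delivering the text's dominant emotion strongly in the voice.
    intensity = audio_emotions.get(text_top, 0.0)
    return match_weight * match + intensity_weight * intensity

def presentation_score(clause_scores):
    """Average per-clause scores into a single score for the whole speech."""
    return sum(clause_scores) / len(clause_scores) if clause_scores else 0.0
```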
Another, smaller challenge was choosing an effective way to split text into clauses. For the real-time text analysis, each clause had to contain enough words to carry meaningful emotional signal, yet be small enough that the user gets frequent feedback.
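One simple way to strike that balance is to split on punctuation and then merge fragments until they reach a minimum length. This is a sketch only; the delimiter set and the five-word threshold are assumptions for illustration.

```python
import re

MIN_WORDS = 5  # assumed threshold: shorter fragments carry too little emotional signal

def split_into_clauses(text):
    """Split text on common clause boundaries, then merge fragments that are too short."""
    fragments = [f.strip() for f in re.split(r"[.!?;,]+", text) if f.strip()]
    clauses, buffer = [], ""
    for fragment in fragments:
        buffer = (buffer + " " + fragment).strip()
        if len(buffer.split()) >= MIN_WORDS:
            clauses.append(buffer)
            buffer = ""
    if buffer:
        # Attach any short trailing fragment to the previous clause.
        if clauses:
            clauses[-1] += " " + buffer
        else:
            clauses.append(buffer)
    return clauses
```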
Accomplishments that we're proud of
We ran audio analysis using a deep neural network. Training data for vocal emotion is really hard to find, so we built on top of existing models like DeepAffects. We are proud that, going beyond the classic neutral/positive/negative tagging, our application predicts fine-grained emotions and scores from your voice and maps them to the actual sentiment of the words as measured by IBM Watson.
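Conceptually, that mapping reduces fine-grained vocal emotions back to the classic sentiment buckets so they can be compared against Watson's output. The label set and mapping below are assumed for illustration; they are not DeepAffects' exact output schema.

```python
# Assumed label set and mapping, for illustration only.
EMOTION_TO_SENTIMENT = {
    "joy": "positive",
    "excitement": "positive",
    "neutral": "neutral",
    "sadness": "negative",
    "anger": "negative",
    "fear": "negative",
    "disgust": "negative",
}

def voice_matches_text(audio_emotion, watson_sentiment_label):
    """True if the vocal emotion falls in the same sentiment bucket as the text."""
    return EMOTION_TO_SENTIMENT.get(audio_emotion, "neutral") == watson_sentiment_label
```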
What we learned
We learned that there is not much publicly available data for training audio models, so it was important to build on pre-trained models and use RESTful APIs wherever possible. At the same time, as back-end engineers, we learned about microservices, communication between client and server, and how to use jQuery and JavaScript.
What's next for Hearo
We want to keep building the product we started at CalHacks 6.0. Next steps include live emotion tagging straight from the microphone, reducing latency, and improving the accuracy of the models we use.