Inspiration
When quarantine was introduced, many individuals, including us, were starved of social interaction. Transitioning from text-only exchanges back to fully fledged conversations after the pandemic was difficult, let alone public speeches and presentations. Although enough time has passed to rebuild our ability to talk to others and have fruitful conversations, speeches in front of large groups still stand as challenges we'd rather not face. Knowing that the prevalence of social anxiety is rising drastically in our society, we decided to tackle this issue, which we and many others face, with our app, Verbalyst.
What it does
Verbalyst lets users rehearse and record their speeches with both audio and camera as many times as they like. It analyzes each recording to determine how many times the user stuttered over the course of their speech. This encourages practice and familiarity with speaking, aiding people of all ages, since speaking skills are universal: they're needed in industry, in classrooms, in friend groups, you name it. Users can track their progress through the STATS page and see their growth right in front of their eyes, and for individuals who are already confident in their speaking abilities, Verbalyst offers a fun challenge in the form of a tongue-twister generator. Users can record their tongue twisters and catch themselves saying filler words and other stutters, which helps eliminate poor habits.
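This writeup doesn't spell out the detection logic, but a minimal sketch of transcript-based counting conveys the idea. The filler list, function name, and sample sentence below are illustrative assumptions, not Verbalyst's exact heuristics:

```python
import re

# Assumed filler vocabulary; the real app's list may differ.
FILLERS = {"um", "uh", "like", "so"}

def count_stutters(transcript: str) -> dict:
    """Count filler words and immediate word repetitions in a transcript."""
    words = re.findall(r"[a-z']+", transcript.lower())
    fillers = sum(1 for word in words if word in FILLERS)
    # Treat an immediate repetition such as "I I think" as a stutter.
    repetitions = sum(1 for a, b in zip(words, words[1:]) if a == b)
    return {"fillers": fillers, "repetitions": repetitions}

print(count_stutters("Um, so I I think the the answer is, like, this."))
# {'fillers': 3, 'repetitions': 2}
```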
How we built it
We used TailwindCSS plus vanilla HTML and CSS to create all of the website pages. We used JavaScript for our recording software and Flask to connect our front end to our back end. The back end was written in Python: we used the Vertex AI API to generate an AI analysis of the speech the user submitted, and the AssemblyAI API to run speech-to-text on the user's uploaded MP4 file.
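To make that pipeline concrete, here is a minimal sketch of how a Flask endpoint could tie the two APIs together, assuming the current AssemblyAI and Vertex AI Python SDKs. The route name, file handling, model choice, and prompt are our assumptions rather than the exact hackathon code:

```python
from flask import Flask, request, jsonify
import assemblyai as aai
import vertexai
from vertexai.generative_models import GenerativeModel

app = Flask(__name__)
aai.settings.api_key = "YOUR_ASSEMBLYAI_KEY"  # in practice, load from config
vertexai.init(project="your-gcp-project", location="us-central1")

@app.route("/analyze", methods=["POST"])  # hypothetical route name
def analyze():
    # Save the uploaded MP4 recording, then hand it to AssemblyAI.
    recording = request.files["recording"]
    path = "upload.mp4"
    recording.save(path)

    transcript = aai.Transcriber().transcribe(path)

    # Ask a Vertex AI model to analyze the transcript for stutters and fillers.
    model = GenerativeModel("gemini-1.0-pro")  # model choice is an assumption
    analysis = model.generate_content(
        "Count the stutters and filler words in this speech transcript:\n"
        + transcript.text
    )
    return jsonify({"transcript": transcript.text, "analysis": analysis.text})

if __name__ == "__main__":
    app.run(debug=True)
```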
Challenges we ran into
Starting 26 hours behind (at 8 pm EST) due to scheduling issues and personal conflicts posed a great challenge. Despite this major setback, we adapted swiftly, optimizing our time to make up for the lost hours.
Although the end product was less complete than we anticipated, we did get both of our AI components working individually through the console. Creating a cohesive web application with proper functionality within our severe time crunch was what proved to be the problem. Despite this, we still tried our best to implement all of the components we envisioned yesterday, and we believe we did the best we could with the time we had.
Accomplishments that we're proud of
Given the aforementioned challenges, we're proud that we have an aesthetically pleasing homepage with a functioning recording feature that lets the user view their recording live and download it by pressing the kebab-menu button. The recording feature also lets the user switch tabs and still see themselves via the Picture-in-Picture button, so they can read their script in another tab if required. We're also proud that we got the Google Vertex AI component to recognize stutters in a speech, and that we got the AssemblyAI speech-to-text portion working on its own.
What we learned
The entire project was essentially a learning journey for us; we learned a great deal to give it the functionality it has now. For starters, creating the recording feature and being able to receive and store user audio input so it could be fed to the AssemblyAI speech-to-text model was completely new to us. Although we were unable to wire the user audio input and the AssemblyAI model together through Flask, we still learned how to use AssemblyAI to convert premade audio into text. Even the CSS framework we used for this project (TailwindCSS) was new, so it's safe to say we put a lot of time and effort into learning new things.
What's next for Verbalyst
We had quite an ambitious plan for our project, but our 26-hour delay kept us from reaching it, so we actually have a plan for what to do even after the hackathon ends. We will finish the Flask integration we were close to completing so that our website works as intended. There were also limitations to the speech analysis we could do: we were able to detect repetition and stutters purely from the speech-to-text transcript, but we could not find an AI that identifies lisps, speaking too fast or too slow, or other speech impediments. In the future we will try to find an API, or hopefully even build a model, that can do this; a sketch of one possible pace check appears below. Finally, some portions of our UI/UX are inconsistent, so we plan to fix that too.
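One hedged idea for the pace check: AssemblyAI transcripts already carry word-level timestamps, so a words-per-minute estimate may not need a separate model at all. The file name and thresholds below are assumptions:

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_ASSEMBLYAI_KEY"

# Word-level timestamps (in milliseconds) come back with the transcript.
transcript = aai.Transcriber().transcribe("speech.mp4")  # hypothetical recording
words = transcript.words

duration_minutes = (words[-1].end - words[0].start) / 60_000
wpm = len(words) / duration_minutes
print(f"Estimated pace: {wpm:.0f} words per minute")

# Roughly 120-160 wpm is conversational; far outside that range could be
# flagged as "too fast" or "too slow" (the thresholds here are assumptions).
if wpm > 160:
    print("Consider slowing down.")
elif wpm < 120:
    print("Consider speeding up.")
```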
Built With
- assembly-ai
- css
- flask
- google-vertex-ai
- html
- javascript
- python
- tailwind