Inspiration
We noticed that many people feel anxious about starting therapy or don't know where to get started. We wanted to create something that could naturally converse with the user in a comfortable way that could ease them into therapy.
What it does
Therabox can hold a natural, spoken conversation in a therapeutic setting: it requires no typing and responds specifically to the concerns you bring up. It works best in a Q&A session, and you can converse with it as you would with a friend.
How we built it
Our project has three main components: GPT-3, AssemblyAI, and Google Cloud. When a user speaks to Therabox, it detects a question and keeps listening until the user finishes talking about their concerns. First, we use AssemblyAI to turn that voice input into text. Then we feed that text into a GPT-3 model that we've fine-tuned to give a therapist-esque response. During the process, the question and response are saved to a chat log file so GPT-3 can remember what was said earlier. Lastly, after saving the chat log, we use Google Cloud Text-to-Speech to make Therabox "talk" out loud to the user. This back-and-forth repeats until the user decides to end the conversation (or we run out of API credits).
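As a rough illustration of one turn of that loop, here is a minimal Python sketch. It is not our actual script: the three service calls are passed in as plain functions (`transcribe`, `complete`, and `speak` are hypothetical stand-ins for the AssemblyAI, GPT-3, and Google Cloud calls), and the preamble text, the JSON log layout, and the "Client:"/"Therapist:" labels are assumptions made for the example.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Callable

# Hypothetical preamble; the real prompt is not part of this write-up.
PREAMBLE = "The following is a conversation between a supportive therapist and a client."


def load_log(path: Path) -> list[dict]:
    """Return the saved turns, or an empty list if no log exists yet."""
    if path.exists():
        return json.loads(path.read_text())
    return []


def build_prompt(log: list[dict], user_text: str) -> str:
    """Replay earlier turns so the model sees the prior conversation."""
    lines = [PREAMBLE, ""]
    for turn in log:
        lines.append(f"Client: {turn['user']}")
        lines.append(f"Therapist: {turn['bot']}")
    lines.append(f"Client: {user_text}")
    lines.append("Therapist:")
    return "\n".join(lines)


def run_turn(
    audio: bytes,
    log_path: Path,
    transcribe: Callable[[bytes], str],  # stand-in for speech-to-text
    complete: Callable[[str], str],      # stand-in for the GPT-3 call
    speak: Callable[[str], None],        # stand-in for text-to-speech
) -> str:
    """Run one question/response turn and persist it to the chat log."""
    user_text = transcribe(audio)
    log = load_log(log_path)
    reply = complete(build_prompt(log, user_text)).strip()
    # Save the turn before speaking, matching the order described above.
    log.append({"user": user_text, "bot": reply})
    log_path.write_text(json.dumps(log, indent=2))
    speak(reply)
    return reply
```

Passing the services in as functions keeps the sketch runnable without API keys; in a real script each one would wrap the corresponding client library.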
Challenges we ran into
While building Therabox, we learned just how challenging it can be to integrate with some APIs. Google Cloud takes security very seriously, so it took a lot of effort to authenticate properly and integrate the service into our script. But once everything was set up, the integration was solid.
Interacting with the microphone and speaker connected to the Raspberry Pi also proved to be challenging. The raw data received from microphones is tough to process, so we spent a great deal of time converting it to a standard format.
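To give a sense of what that conversion involves, here is a small sketch that wraps raw PCM samples in a WAV container using only Python's standard library. The format is an assumption for the example (16-bit little-endian mono at 16 kHz), and `pcm_to_wav` is a hypothetical helper name, not a function from our script.

```python
import io
import struct
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 16000,  # assumed sample rate
    channels: int = 1,         # assumed mono input
    sample_width: int = 2,     # bytes per sample (16-bit)
) -> bytes:
    """Wrap raw PCM frames in a WAV header and return the file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


# Four 16-bit samples standing in for a microphone buffer.
samples = struct.pack("<4h", 0, 1000, -1000, 0)
wav_bytes = pcm_to_wav(samples)
```

The resulting bytes start with a standard RIFF/WAVE header, so they can be written to a `.wav` file or handed to any tool that expects that format.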
Accomplishments that we're proud of
Since GPT-3 is a general language synthesis AI, it can generate any sort of text in any context and from any perspective. Getting its parameters just right to reliably produce realistic and helpful therapist-like responses was challenging, but we’re very proud of what we achieved. Our script almost always picks up on the context provided by the user and gives helpful, creative advice.
Coming into the project, we were concerned that we wouldn’t be able to achieve a natural back-and-forth conversation with our bot, since we didn’t know whether we could reliably detect when the user begins and finishes speaking. However, we developed an algorithm that works with AssemblyAI’s streaming endpoint to detect the beginning and end of speech very reliably and without too much delay. We’re very proud of the result.
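One simple way to approach this kind of detection is sketched below. It is not the algorithm we shipped, and it does not use AssemblyAI's actual message format: it assumes the stream has already been reduced to `(timestamp_seconds, partial_transcript)` pairs, treats the first non-empty transcript as the start of speech, and treats a transcript that has not changed for `silence_timeout` seconds as the end. The function name and the 1.5-second default are hypothetical.

```python
from typing import Iterable, Optional, Tuple


def detect_utterance(
    events: Iterable[Tuple[float, str]],
    silence_timeout: float = 1.5,  # assumed pause length that ends a turn
) -> Optional[str]:
    """Return the utterance once the partial transcript stops changing."""
    started = False
    last_text = ""
    last_change = 0.0
    for t, text in events:
        text = text.strip()
        if not started:
            # Speech begins with the first non-empty partial transcript.
            if text:
                started = True
                last_text = text
                last_change = t
            continue
        if text and text != last_text:
            # The user is still talking: remember the newest transcript.
            last_text = text
            last_change = t
        elif t - last_change >= silence_timeout:
            # No change for long enough: treat the utterance as finished.
            return last_text
    # The stream ended; return whatever was heard, or None if nothing was.
    return last_text if started else None


events = [
    (0.0, ""),
    (0.5, "I"),
    (1.0, "I feel"),
    (1.5, "I feel sad"),
    (2.0, "I feel sad"),
    (3.1, "I feel sad"),
]
utterance = detect_utterance(events)
```

A shorter timeout makes the bot respond faster but risks cutting the user off mid-thought, so the value is a trade-off between delay and reliability.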
What we learned
We learned more about working with a pretrained model when using GPT-3. Although it's already trained as a general natural language model, we needed to add additional data to adapt it specifically to therapeutic conversations. Besides working with the model, we gained experience with both text-to-speech and speech-to-text conversion. After completing the basic program, we also learned about the hardware limitations of different devices. Since we wanted the final product to run independently, we needed the software to fit within the Raspberry Pi's hardware limitations.
What's next for Therabox
We have big plans for Therabox in the future.
First of all, we believe that we can improve how "coherent" the responses from GPT-3 are. One way we hope to do this is with part-of-speech (POS) tagging, which is the process of splitting a sentence into words, attaching a tag to each word (such as article, noun, or adjective), and analyzing the generated tags to do the following tasks:
- Sentiment analysis: gauge the user's emotions to build a better prompt for GPT-3, one that is tailored to the nuances of the conversation.
- Correcting input and output: people don't always speak grammatically, and GPT-3 sometimes produces odd output, so we would correct both to generate better responses for the user.
Another addition that would dramatically improve Therabox is therapist recommendations. Therabox is in no way a replacement for a therapist, and we want to make sure we can provide our users with the help that they need. It would be amazing to have access to a database or some sites that have information on nearby therapists.
Built With
- assemblyai
- google-cloud-text-to-speech
- gpt-3
- openai
- python
- raspberry-pi