A SoulVerse | Devpost

Inspiration

Our inspiration stems from the inherent human desire to maintain bonds that extend beyond the limitations of mortality. We have witnessed countless individuals yearning for one more conversation, one more chance to express their love, seek guidance, or simply share precious moments with those who have left this world. We are inspired by the belief that death should not mark the end of a relationship, but rather a new phase where the power of technology can offer solace, healing, and an ongoing connection. Our mission is to provide a meaningful avenue for preserving and nurturing these bonds, creating a lasting legacy of connection that traverses the realms of the physical and the ethereal.

What it does

Digital Funeral Service:

We create a heartfelt video, preserving the departed individual's final words through cherished photographs, videos, and voice recordings graciously shared by their family. This special tribute is showcased at the funeral, offering an opportunity for the family to personally connect with the Al-generated representation.

Interactions Beyond death with Departed Beloved Ones:

Users can experience real-time voice calls with their departed loved ones, while facial recognition technology reconstructs the appearance of the departed, providing a meaningful and interactive connection

How we built it

The technology stack employed for our web application includes JavaScript, CSS, and HTML on the front end, along with Flask and Python on the back end. Additional technologies we've utilized encompass HumeAI, OpenAI, Langchain, gTTS, SpeechRecognition, and PyDub.
Our application encompasses an interactive chat box, designed to accept both text and audio input. Audio inputs are transcribed into text format through the SpeechRecognition API, and all data is then dispatched to the server through JavaScript.
Upon receiving the data, our Flask server creates a local text file. This file is then transmitted to the HumeAI API via an asynchronous batch request, which is facilitated through a repeated call - a necessity due to Flask's inherent synchronous nature. This operation generates a primary key, which we subsequently use to retrieve emotion scores from the HumeAI API. These emotion scores serve two key functions in our system. Firstly, they help train the language model's prompt to produce more emotionally expressive responses in the long-term context. Secondly, they are directly utilized as part of the input in the OpenAI API, shaping the short-term responses to our users. The GPT-4 model's output is subsequently converted into an audio file using Google's Text-to-Speech library, resulting in an AI-produced audio response. This audio file, along with the corresponding text, is displayed in the chatbox simultaneously, enriching user engagement and experience.

Challenges we ran into

The AI response speed is slow, impacting the interactive experience negatively.
Poor stability and slow speed of generating dynamic portrait images
Output from prompts is very unsteady ## Accomplishments that we're proud of By using prompts and leveraging Hume.ai, we achieved stable character design through iterative testing. Hume.ai, a powerful emotion recognition interface for text, pictures, and videos, seamlessly integrates into our digital human interaction, enabling optimized user feedback based on emotional intensity. We improved the realism of dynamic face image generation by exploring and implementing facial expression-driven APIs. By adapting the source code to incorporate voice-driven expression changes, we selected the most suitable interface input for our specific scenario. To address the real-time voice transmission speed, we extensively evaluated and selected the most suitable voice API from a range of options. Despite the availability of multiple speech synthesis APIs, we focused on achieving optimal real-time performance by testing and selecting the most suitable solution. ## What we learned Technology selection and evaluation: Learn to assess and choose the most suitable technology stack based on factors like maturity, scalability, security, and applicability. Iterative development and continuous integration: Embrace iterative methods and continuous integration to enable quick iteration, updates, and efficient response to changes, ensuring high-quality project delivery. Teamwork and communication: Foster effective communication and collaboration among team members, promoting a clear understanding of roles, tasks, and shared information flow. ## What's next for SoulVerse We recognize the importance of Hume.ai's support in obtaining emotional feedback from users and capturing resonant information. To enhance the Hume.ai interface, we aim to optimize it further, leveraging its compatibility with video and voice input. To ensure a stable and efficient dialogue, we aim to prioritize the establishment of a reliable connection between the persona Prompt and the embedding document. Preparations are underway for the server-side infrastructure and data flow management to expand our market, ensuring smooth user handover with utmost comfort and accuracy.

Built With

css
flask
gtts
html
hume
javascript
langchain
openai
python
react

Submitted to

UC Berkeley AI Hackathon

Created by

My responsibilities encompassed back-end development. I utilized Flask Python and JavaScript to facilitate communication between the front-end and the server, as well as integrating both OpenAI and Hume APIs. Notably, this was my inaugural experience with Hume. Its application proved to be quite beneficial for our project as it endowed the Large Language Model (LLM) prompts with a heightened emotional resonance.

ZT Luo
I work on frontend

Xiangwei Kong
Jiankun Sun
Zhengnan Ma

Updates

Jiankun Sun started this project — Jun 18, 2023 04:54 PM EDT

Leave feedback in the comments!

Log in or sign up for Devpost to join the conversation.