Inspiration
Have you ever met someone, but forgot their name right afterwards?
Our inspiration for INFU comes from our own struggles to remember every detail of every conversation. We all deal with moments of embarrassment or disconnection when failing to remember someone’s name or details of past conversations. We know these challenges are not unique to us, but actually common across various social and professional settings. INFU was born to bridge the gap between our human limitations and the potential for enhanced interpersonal connections—ensuring no details or interactions are lost to memory again.
What it does
INFU attaches a camera and microphone to the user, records conversations, transcribes the audio, and matches each conversation to a person using facial recognition. These details are uploaded to a database, summarized by an AI, and displayed on our website and custom wrist wearable.
How we built it
There are three main parts to the project. The first is the hardware, which includes all the wearable components. The second is the face recognition and speech-to-text processing, which receives camera and microphone input from the user's iPhone. The third is storing, updating, and retrieving people's faces, names, and conversations in our database.
The hardware comprises an ESP-32, an OLED screen, and two wires that act as touch buttons. The touch buttons start and stop recording, turning the face recognition and microphone on and off. Data is sent via Bluetooth to the laptop, which processes the face recognition and speech data. Once a person's name and your conversation with them are extracted from the current data or from prior data in the database, the laptop sends that data back to the wearable, which displays it on the OLED screen.
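To make the laptop-to-wearable link concrete, the name and conversation summary can be packed into a small length-prefixed payload that the ESP-32 can split back apart before drawing to the OLED. This is a hedged sketch only: the field size limits and framing are illustrative assumptions, not our actual protocol.

```python
# Hypothetical framing for the name + summary sent to the ESP-32 over
# Bluetooth. Size limits are assumptions based on a small OLED screen.
MAX_NAME = 16      # characters that fit on one OLED line (assumed)
MAX_SUMMARY = 96   # remaining screen budget (assumed)

def frame_payload(name: str, summary: str) -> bytes:
    """Truncate both fields to the OLED budget and pack them as
    length-prefixed UTF-8, so the firmware can split them apart."""
    name_b = name[:MAX_NAME].encode("utf-8")
    summary_b = summary[:MAX_SUMMARY].encode("utf-8")
    return bytes([len(name_b)]) + name_b + bytes([len(summary_b)]) + summary_b

def parse_payload(data: bytes) -> tuple[str, str]:
    """Inverse of frame_payload, as the wearable side would do it."""
    n = data[0]
    name = data[1:1 + n].decode("utf-8")
    m = data[1 + n]
    summary = data[2 + n:2 + n + m].decode("utf-8")
    return name, summary
```

Length-prefixing keeps the firmware parsing trivial: read one byte, read that many bytes, repeat.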
The laptop acts as the control center. It runs a backend Python script that takes in data from the wearable via Bluetooth and from the iPhone via WiFi. The Python Face Recognition library detects the speaker's face and takes a picture. Speech is then transcribed from the microphone using the Google Cloud Speech-to-Text API and passed through the OpenAI API, which extracts the person's name and a summary of the discussion. This data is sent to the wearable and to the cloud database, along with a picture of the person's face labeled with their name. If the user meets the person again, their name and last conversation summary can be retrieved from the database and displayed on the wearable.
Accomplishments that we're proud of
- Creating an end product with a complex tech stack despite various setbacks
- Having a working demo
- Organizing and working efficiently as a team to complete this project over the weekend
- Combining and integrating hardware, software, and AI into a project
What's next for INFU
- Further optimize our hardware
- Develop our own ML model to enhance speech-to-text accuracy across different accents, speech mannerisms, and languages
- Integrate more advanced NLP techniques to refine conversational transcripts
- Improve user experience by employing personalization and privacy features




