Inspiration

We wanted to provide an aid for neurodivergent individuals to more easily navigate social interactions that often rely on unspoken and implicitly shared cues. Because these signals are rarely made explicit, they can be easy to miss or interpret differently. This gap between implicit social norms and individual processing styles is what we aim to bridge.

What it does

The Haptix app lets users recognize and react to completely personalized situations through haptic and auditory feedback. Using our automation platform, users specify the situations and conditions where they would benefit from an informational, cautionary, or other signal. During an active audio session, Haptix automatically detects when any of these conditions is met and sends the chosen signal to the user through our sleek wearable haptic device.

For example, if the user struggles to realize when people are being sarcastic, they can set up an automation that sends two short vibrations whenever the person they are talking to is being sarcastic. They receive this signal live, the moment the Haptix session detects that the condition has been met.
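
Under the hood, an automation boils down to a small structured record that pairs a natural-language trigger with a signal pattern. The sketch below is illustrative rather than our exact schema; the field names (trigger, signal, pattern) are just for the example.

```python
from pydantic import BaseModel

class Automation(BaseModel):
    name: str
    trigger: str        # natural-language condition Haptix watches for during a session
    signal: str         # "vibration" or "sound"
    pattern: list[int]  # pulse durations in milliseconds

sarcasm_cue = Automation(
    name="Sarcasm cue",
    trigger="The person I am talking to is being sarcastic",
    signal="vibration",
    pattern=[200, 200],  # two short vibrations
)
```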

How we built it

The product consists of two major components. The first is our web app, built with React, where automations are created and sessions are recorded. The second is our haptic wristband, which can produce vibrations and sounds of varying intensity. The two communicate through HTTP requests to our custom FastAPI backend. The backend streams the live audio feed from the frontend to ElevenLabs, which produces a highly accurate, realtime transcript of what is being said. This transcript is then fed to Google Gemini, which we set up to issue function calls whenever an automation's conditions are satisfied. Those function calls go back through our backend, which makes the wristband perform its action.
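
To give a sense of the last step, here is a simplified sketch of the Gemini function-calling piece using the google-generativeai Python SDK. The helper name send_haptic_signal, the model choice, and the prompt text are illustrative rather than our exact implementation; in the real app the function call goes back through the FastAPI backend, which drives the wristband.

```python
import google.generativeai as genai

# Illustrative helper: in the real app this hits our FastAPI backend,
# which in turn tells the wristband which pattern to play.
def send_haptic_signal(automation_id: str) -> dict:
    """Trigger the haptic/audio pattern registered for this automation."""
    print(f"Signal fired for automation {automation_id}")
    return {"status": "sent"}

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel(
    "gemini-1.5-flash",  # illustrative model choice
    tools=[send_haptic_signal],
    system_instruction=(
        "You monitor a live conversation transcript. Whenever the user's "
        "condition 'the other speaker is being sarcastic' is met, call "
        "send_haptic_signal with the matching automation id."
    ),
)

# The SDK executes declared tools automatically when Gemini requests them.
chat = model.start_chat(enable_automatic_function_calling=True)

# Each new transcript chunk from ElevenLabs is appended to the conversation.
chat.send_message("Transcript: 'Oh sure, that meeting was a GREAT use of my time.'")
```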

Challenges we ran into

One of the main challenges we ran into was the Gemini API. We initially weren't planning on creating a live transcript and feeding the text to Gemini; we intended to use Gemini's multimodal capabilities to send it the audio directly and receive text signals back. Unfortunately, doing this live runs into a bug in the Gemini library that was reported to Google many months ago but has not been fixed yet. That forced us to pivot, which is what led to the use of ElevenLabs for transcription (and then Gemini on the transcript).

Accomplishments that we're proud of

We are extremely proud of the real-time processing and output; it feels natural and intuitive to use. We are also very excited about our automation editor, which is built with React Flow and integrates cleanly with our automation logic.

What we learned

We learned how much more an LLM lets users do than hardcoded logic ever could. Some APIs were very easy to work with; others were quite challenging. We also learned about real-time processing and what it looks like in the context of a multi-component web app.

What's next for Haptix

We are so excited about the extensibility of Haptix! Upcoming features could include a more advanced workflow editor and AR glasses integration (for visual cues and output), among other things.

Built With

React, React Flow, FastAPI, Python, ElevenLabs, Google Gemini
