Inspiration

For people living with Alzheimer’s, the world can become a room of strangers. This loss of context leads to deep social anxiety. We built ReStory to act as a cognitive safety net: a wearable social second brain that restores a patient's dignity and agency by remembering faces and shared history so they don't have to.

What it does

ReStory is a multi-modal AI wearable that fuses computer vision with neural audio processing:

  • Biometric Identity Fusion: Captures 512-d ArcFace facial and 192-d ECAPA-TDNN vocal embeddings to create a unique signature for every contact (a matching sketch follows this list).
  • Wearer-First Locking: Uses a custom Wearer Anchor to explicitly isolate the user’s own voice, ensuring the AI never confuses the wearer's own speech with a participant's (also sketched after this list).
  • Neural Diarization: Performs real-time speaker separation to ensure conversation threads are never crossed.
  • Real-time Memory Feed: A mobile dashboard automatically sorts contacts by interaction frequency, highlighting the person you are currently talking to and displaying their face for instant verification.
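
To make the fusion concrete, here is a minimal sketch of the matching idea, assuming both extractors emit L2-normalized vectors. The 704-d concatenation, the helper names, and the 0.6 threshold are illustrative, not our tuned production values.

```python
# Minimal sketch of identity fusion, not our exact production code.
# Assumes `face_emb` is a 512-d ArcFace vector and `voice_emb` a 192-d
# ECAPA-TDNN vector, both already L2-normalized by their extractors.
import numpy as np

def fuse_identity(face_emb: np.ndarray, voice_emb: np.ndarray) -> np.ndarray:
    """Concatenate the two modality embeddings into one 704-d signature."""
    fused = np.concatenate([face_emb, voice_emb])
    return fused / np.linalg.norm(fused)  # re-normalize so cosine = dot product

def match(signature: np.ndarray, known: dict[str, np.ndarray],
          threshold: float = 0.6) -> str | None:
    """Return the best-matching contact name, or None for a new stranger.
    The 0.6 threshold is illustrative, not a tuned value."""
    best_name, best_sim = None, threshold
    for name, ref in known.items():
        sim = float(np.dot(signature, ref))  # cosine similarity on unit vectors
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name
```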
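
And a sketch of the Wearer-First Locking gate, assuming a wearer anchor embedding is enrolled once at setup; the function name and the 0.7 threshold are illustrative.

```python
# Sketch of Wearer-First Locking: every audio segment is compared to a
# fixed "wearer anchor" embedding, and anything close enough is attributed
# to the wearer before the segment reaches diarization.
import numpy as np

def is_wearer(segment_emb: np.ndarray, wearer_anchor: np.ndarray,
              threshold: float = 0.7) -> bool:
    sim = float(np.dot(segment_emb, wearer_anchor))  # unit-norm vectors
    return sim >= threshold
```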

How we built it

We utilized a distributed edge-to-GPU architecture:

  • Edge Ingress: A Raspberry Pi 5 streams 30 FPS video and PCM audio over a custom low-latency binary WebSocket (a framing sketch follows this list).
  • Vision & Audio Stack: Uses InsightFace for high-precision biometric recognition and OpenCV for Spatial Zone Locking to maintain identity even when faces move or turn.
  • Neural Synthesis: Gemini 3 Flash processes diarized transcripts into synthesized JSON lore, merging new facts with existing baseline knowledge (see the merge sketch below).
  • Dynamic Frontend: A React-based mobile dashboard manages the "active speaker" state, sorting the social mental map by frequency and recency (ranking rule sketched below).
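
The exact wire format is ours alone, so the sketch below is an assumption about its shape rather than the real protocol: a tiny fixed struct header (frame type, timestamp, payload length) followed by raw bytes, sent with the `websockets` library. `capture_jpeg_frame` and `read_pcm_chunk` are hypothetical stand-ins for the Pi's camera and mic helpers.

```python
# Plausible sketch of the low-latency binary framing; field layout is an
# assumption. Each frame is a small fixed header plus the raw payload:
# no JSON, no base64 overhead.
import asyncio
import struct
import time

import websockets  # pip install websockets

FRAME_VIDEO, FRAME_AUDIO = 0, 1
HEADER = struct.Struct("!BdI")  # type (u8), timestamp (f64), payload length (u32)

def pack_frame(frame_type: int, payload: bytes) -> bytes:
    return HEADER.pack(frame_type, time.time(), len(payload)) + payload

async def stream(uri: str = "ws://gpu-box:8765") -> None:
    # `uri` is a placeholder; point it at the GPU ingest server.
    async with websockets.connect(uri) as ws:
        while True:
            jpeg = capture_jpeg_frame()   # hypothetical camera helper
            pcm = read_pcm_chunk()        # hypothetical mic helper
            await ws.send(pack_frame(FRAME_VIDEO, jpeg))
            await ws.send(pack_frame(FRAME_AUDIO, pcm))
            await asyncio.sleep(1 / 30)   # ~30 FPS pacing
```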
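
The lore-merge step can be sketched as a plain dictionary merge, assuming Gemini returns JSON shaped roughly like `{"name", "relationship", "facts"}`; this schema is illustrative, not our exact contract.

```python
# Sketch of merging a new synthesized lore payload into baseline knowledge:
# facts accumulate as a de-duplicated union, scalar fields take the newest value.
import json

def merge_lore(baseline: dict, new: dict) -> dict:
    merged = dict(baseline)
    seen = set(baseline.get("facts", []))
    merged["facts"] = baseline.get("facts", []) + [
        f for f in new.get("facts", []) if f not in seen
    ]
    for key in ("name", "relationship"):
        if new.get(key):  # newer conversations win for scalar fields
            merged[key] = new[key]
    return merged

baseline = {"name": "Sam", "relationship": "coworker", "facts": ["likes hiking"]}
update = json.loads('{"relationship": "lunch buddy", "facts": ["likes hiking", "has a dog"]}')
print(merge_lore(baseline, update))
# {'name': 'Sam', 'relationship': 'lunch buddy', 'facts': ['likes hiking', 'has a dog']}
```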
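
The ranking rule itself is simple enough to sketch outside React; the field names here are illustrative.

```python
# Sketch of the feed-ranking rule: the active speaker pins to the top, then
# contacts sort by interaction count, with recency as the tie-breaker.
def rank_feed(contacts: list[dict], active_id: str | None) -> list[dict]:
    return sorted(
        contacts,
        key=lambda c: (c["id"] == active_id, c["interactions"], c["last_seen"]),
        reverse=True,
    )
```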

Challenges we ran into

  • Identity Drift: Handling how a person’s voice shifts with room acoustics was a major hurdle. We implemented Centroid Evolution and Silence Protection to "lock" voice candidates to a face rather than creating "new" strangers every session (centroid update sketched after this list).
  • The Wearer Paradox: Filtering out the wearer's own voice in a wearable form factor is difficult. We solved this with a Probabilistic Linker that uses spatial co-occurrence to attribute lore correctly (linker sketched below).
  • Blackwell 5090 Integration: As early adopters of the RTX 5090, we had to patch around Torch 2.4+ security restrictions (weights_only) to load our model weights on Blackwell silicon (workaround sketched below).
  • NumPy Scalar Safety: We fixed persistent logic crashes by hardening our linker against the multi-element arrays returned by ChromaDB vector distance queries (see the final sketch below).
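
A minimal sketch of the Centroid Evolution update, assuming unit-norm voice embeddings; `alpha` is illustrative, not our tuned rate.

```python
# Each contact's voice identity is a running centroid that slowly absorbs
# new embeddings instead of spawning a new "stranger" every session.
import numpy as np

def evolve_centroid(centroid: np.ndarray, new_emb: np.ndarray,
                    alpha: float = 0.1) -> np.ndarray:
    """Exponential moving average over unit-norm voice embeddings."""
    updated = (1.0 - alpha) * centroid + alpha * new_emb
    return updated / np.linalg.norm(updated)  # stay on the unit sphere
```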
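
At its core, the Probabilistic Linker reduces to counting which face track each voice cluster co-occurs with; the sketch below shows that core under an assumed `(voice_id, face_id)` pairing per time window.

```python
# Sketch of spatial co-occurrence linking: a voice cluster is attributed to
# the face track it co-occurs with most often, so an off-camera voice never
# steals a visible participant's lore. The data shape is illustrative.
from collections import Counter

def link_voice_to_face(cooccurrences: list[tuple[str, str]]) -> dict[str, str]:
    """cooccurrences: (voice_cluster_id, face_track_id) pairs, one per window."""
    counts: dict[str, Counter] = {}
    for voice_id, face_id in cooccurrences:
        counts.setdefault(voice_id, Counter())[face_id] += 1
    return {voice_id: c.most_common(1)[0][0] for voice_id, c in counts.items()}
```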
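
The weights_only workaround, for checkpoints we built and trust ourselves, is a one-line opt-out; newer Torch builds default `torch.load` to `weights_only=True` and reject checkpoints containing arbitrary pickled objects. (The stricter alternative on recent Torch versions is allow-listing the needed classes via `torch.serialization.add_safe_globals`.)

```python
# Only safe for checkpoints you produced yourself: weights_only=False
# re-enables full unpickling. The path is a placeholder.
import torch

state = torch.load("model.pt", map_location="cuda", weights_only=False)
```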
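
And the scalar-safety fix, in sketch form: ChromaDB's `query()` nests results per query embedding, so the nearest distance lives at `["distances"][0][0]`, and truth-testing the inner array is exactly what crashes. The handling below is a minimal illustration.

```python
# Harden distance extraction against ChromaDB's nested result arrays,
# avoiding "truth value of an array with more than one element is ambiguous".
import numpy as np

def nearest_distance(results: dict) -> float | None:
    distances = results.get("distances") or []
    if not distances or len(distances[0]) == 0:
        return None  # no neighbors found for this query
    return float(np.asarray(distances[0]).ravel()[0])  # force a true scalar
```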

Accomplishments that we're proud of

Seeing the whole pipeline work end to end was incredible. From mocking up a real-world situation and role-playing as a new coworker, then a lunch buddy, then a family friend, it was really cool to watch the memory map update and synthesize itself automatically over time.

What we learned

The hardest part of multi-modal AI is state orchestration. Managing the lifecycle of a person, from a bounding box to a voice embedding to a synthesized JSON lore object, requires a rigorous data integrity layer; a toy sketch of that lifecycle follows.
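
A toy sketch of that lifecycle guardrail, with illustrative field names: one record owns the whole lifecycle, and the later stages are only reachable once the earlier fields exist.

```python
# Illustrative lifecycle record: bounding box -> voice centroid -> lore.
from dataclasses import dataclass, field

import numpy as np

@dataclass
class PersonState:
    track_id: int                              # from the face bounding box
    face_emb: np.ndarray | None = None         # 512-d ArcFace vector
    voice_centroid: np.ndarray | None = None   # 192-d ECAPA-TDNN centroid
    lore: dict = field(default_factory=dict)   # synthesized JSON lore

    def ready_for_lore(self) -> bool:
        # Guardrail: never synthesize lore for a half-formed identity.
        return self.face_emb is not None and self.voice_centroid is not None
```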

What's next for ReStory

This is one of our favorite Hackathon ideas so far and we think it can 100% be pursued further. Integrating ReStory into dedicated wearable hardware like Meta Ray-Bans using their SDK would be a powerful next step, giving patients an even more discreet, impactful, hands-free experience.
