What Inspired Us

Whenever we forgot where we had put something at home, we asked our moms, and somehow they always knew. But what if you don’t have someone you can rely on?

The sad reality is that for people living with dementia, the problem occurs more frequently and is far more stressful. Losing essential items can disrupt their daily routines, increase anxiety, and reduce their independence.

While modern AI systems and XR prototypes can recognize objects in real time, their detections are ephemeral: they keep no long-term memory of where important items are. As far as we could tell, no existing system accurately tracks and remembers this kind of spatial information.

So we realized that what we were missing wasn't just a tool to find objects, but a form of spatial memory where a machine could remember where things were last seen. That became the foundation of our project: a system designed to help people recall where everyday objects are placed, giving them peace of mind.

What We Learned

Building this project taught us a lot, especially since we were all learning to use a ROS2 backend, RTAB-Map for mapping, and YOLO for object detection. We also designed and implemented a custom depth mapping pipeline, which helped us understand how spatial perception works in real-world robotic systems.

We gained hands-on experience with real-time data streaming, coordinate transformations, and integrating multiple AI and visualization tools into one cohesive system.

How We Built It

Architecture Overview

We built a full-stack spatial memory system consisting of:

Frontend

  • Next.js 16 with TypeScript, featuring a dark “space lab” UI using a custom Tailwind theme.

ROS2 Backend

  • All sensor data integration, mapping, object detection, and depth mapping are handled through ROS2.

3D Visualization

  • Three.js for interactive spatial maps with object confidence visualization.

Voice Interface

  • Web Speech API for voice input
  • ElevenLabs for text-to-speech
  • Cerebras inference platform for near-instant responses

Challenges We Faced

Technical Challenges

Coordinate System Difference

ROS and Three.js use different axis conventions: ROS (REP 103) is right-handed with X forward, Y left, and Z up, while Three.js is right-handed with Y up. Until we worked this out, we could not understand why objects appeared “underground”. A custom rosToThree() transformation function fixed the alignment between the two coordinate spaces.
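To make the mismatch concrete, here is a minimal sketch of what such a conversion can look like. The project's actual rosToThree() is not shown in this write-up, so the axis mapping below (a -90 degree rotation about X) and the Vec3 type are assumptions for illustration. If ROS coordinates are copied into Three.js unchanged, the ROS Y axis (left/right) is read as height, so anything to the right of the origin renders below the ground plane.

```typescript
// Plain vector type so the sketch runs without importing three.js.
interface Vec3 {
  x: number;
  y: number;
  z: number;
}

// Assumed mapping (not necessarily the project's): rotate -90 degrees about X.
//   ROS X (forward) -> Three.js X
//   ROS Z (up)      -> Three.js Y (up)
//   ROS Y (left)    -> Three.js -Z
// The mapping is a pure rotation, so handedness and distances are preserved.
function rosToThree(p: Vec3): Vec3 {
  return { x: p.x, y: p.z, z: -p.y };
}

// A point 1.5 m above the ROS ground plane ends up 1.5 units up in Three.js.
const shelfInThree = rosToThree({ x: 2, y: 0.5, z: 1.5 });
console.log(shelfInThree);
```

Other valid mappings exist (for example, sending ROS forward to Three.js -Z so it matches the default camera direction). What matters is that the same rotation is applied everywhere.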

Real-Time State Synchronization

Managing camera position, detected objects, and navigation guidance state at the same time required careful state design across the frontend and backend to prevent race conditions and data desynchronization.
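One common way to guard against out-of-order updates is to timestamp every detection and ignore anything older than what is already stored. The sketch below shows that idea. The Detection shape, the Memory type, and applyDetection() are hypothetical names for illustration, not the project's actual state code.

```typescript
// Hypothetical shape of one detection arriving from the backend.
interface Detection {
  label: string;
  position: { x: number; y: number; z: number };
  confidence: number;
  stamp: number; // capture time in milliseconds
}

// Last-seen memory: one entry per object label.
type Memory = ReadonlyMap<string, Detection>;

// Returns a new memory containing `incoming`, unless an equally new or newer
// detection for the same label is already stored. Stale messages are dropped,
// so a late-arriving packet cannot overwrite a fresher position.
function applyDetection(memory: Memory, incoming: Detection): Memory {
  const current = memory.get(incoming.label);
  if (current && current.stamp >= incoming.stamp) {
    return memory;
  }
  const next = new Map(memory);
  next.set(incoming.label, incoming);
  return next;
}

let memory: Memory = new Map();
memory = applyDetection(memory, {
  label: "keys",
  position: { x: 1, y: 0, z: 2 },
  confidence: 0.9,
  stamp: 2000,
});
// This older message arrives late and is ignored.
memory = applyDetection(memory, {
  label: "keys",
  position: { x: 5, y: 0, z: 5 },
  confidence: 0.8,
  stamp: 1000,
});
console.log(memory.get("keys")?.position);
```

Returning a new map instead of mutating the old one keeps the update easy to use with React-style state, where a changed reference signals that a re-render is needed.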

Voice Recognition Accuracy

Handling different phrasing patterns like:

  • “Where are my keys?”
  • “Find keys”
  • “Keys location”

required building multiple regex matching rules and fallback logic to ensure reliable voice interpretation.
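As a rough illustration of this approach, the sketch below normalizes a transcript, tries a few patterns in order, and falls back to scanning for a known label. The specific patterns, the KNOWN_OBJECTS list, and the extractObject() name are assumptions for illustration, not the project's actual rules.

```typescript
// Ordered patterns; the first capture group is the object being asked about.
const QUERY_PATTERNS: RegExp[] = [
  /^where(?:'s| is| are)\s+(?:my |the )?(.+)$/, // "where are my keys"
  /^(?:find|locate)\s+(?:my |the )?(.+)$/, // "find keys"
  /^(.+?)\s+location$/, // "keys location"
];

// Hypothetical list of labels the detector knows about (used by the fallback).
const KNOWN_OBJECTS: string[] = ["keys", "wallet", "phone", "glasses", "remote"];

function extractObject(transcript: string): string | null {
  // Lowercase, strip punctuation, and collapse whitespace.
  const text = transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s']/g, "")
    .replace(/\s+/g, " ")
    .trim();

  for (const pattern of QUERY_PATTERNS) {
    const hit = text.match(pattern)?.[1];
    if (hit) {
      return hit.trim();
    }
  }

  // Fallback: no pattern matched, so look for any known label in the words.
  const words = text.split(" ");
  return KNOWN_OBJECTS.find((label) => words.includes(label)) ?? null;
}

for (const phrase of ["Where are my keys?", "Find keys", "Keys location"]) {
  console.log(phrase, "->", extractObject(phrase));
}
```

All three example phrasings resolve to the same object name, and an utterance that matches nothing returns null so the interface can ask the user to repeat the request.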

3D Performance Optimization

Rendering hundreds of point cloud points at 60 FPS required careful geometry disposal, batching strategies, and level-of-detail optimizations to keep performance smooth and responsive.
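A simple form of batching is to pack all points into one flat typed array so the whole cloud is drawn as a single object, and a crude level of detail is to keep only every n-th point. The sketch below shows both ideas without depending on Three.js. The packPositions() helper and its stride parameter are hypothetical, not the project's code.

```typescript
interface Point {
  x: number;
  y: number;
  z: number;
}

// Packs points into a flat [x0, y0, z0, x1, y1, z1, ...] array, keeping every
// `stride`-th point. A larger stride gives a coarser, cheaper cloud.
function packPositions(points: readonly Point[], stride = 1): Float32Array {
  const step = Math.max(1, Math.floor(stride));
  const count = Math.ceil(points.length / step);
  const out = new Float32Array(count * 3);
  let offset = 0;
  points.forEach((p, i) => {
    if (i % step !== 0) {
      return; // skipped by the level-of-detail stride
    }
    out[offset++] = p.x;
    out[offset++] = p.y;
    out[offset++] = p.z;
  });
  return out;
}

const cloud: Point[] = [
  { x: 0, y: 0, z: 0 },
  { x: 1, y: 1, z: 1 },
  { x: 2, y: 2, z: 2 },
  { x: 3, y: 3, z: 3 },
  { x: 4, y: 4, z: 4 },
];
const full = packPositions(cloud);
const coarse = packPositions(cloud, 2);
console.log(full.length / 3, coarse.length / 3);
```

In Three.js, an array like this can be wrapped in a THREE.BufferAttribute with an item size of 3 and attached to a BufferGeometry as its "position" attribute. When the cloud is replaced, calling dispose() on the old geometry releases its GPU buffers.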
