Inspiration

At our school, the University of Florida, we have a bat house that is home to thousands of bats. Every sunset, dozens of spectators watch as the bats leave their home and fly into the night. Their echolocation made us curious about how a similar idea could be used for human navigation.

What it does

With a recent enough iPhone, blind and visually impaired users can let the LIDAR sensor on their phone see for them. Holding the phone in front of them, the user can detect obstacles in their path, such as walls, stairs, and tripping hazards. The app warns them with increasingly frequent beeps, effectively acting as a digital white cane. We add a twist with Spatial Sound, which places a sound at a particular 3D point when the user is wearing headphones: the beeps are rendered in front of the user, emphasizing that the obstacle is there. In addition, the app offers mapping and navigation features. Users can speak their desired destination to the app, which transcribes the request and starts navigation, giving intermittent spoken directions just like a GPS. Using both features in tandem, we hope blind and visually impaired users will have a much smoother traveling experience.
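One way to realize the increasingly frequent beeps is to derive the beep cadence from the distance to the nearest detected obstacle. The sketch below is illustrative rather than our exact code; the function name, range, and interval bounds are assumptions.

```swift
import Foundation

// Hypothetical mapping from obstacle distance to time between beeps:
// the closer the obstacle, the shorter the pause. Returns nil when
// nothing is within warning range.
func beepInterval(forObstacleAt distanceMeters: Float,
                  maxRange: Float = 4.0,            // assumed usable warning range
                  minInterval: TimeInterval = 0.1,  // near-continuous when very close
                  maxInterval: TimeInterval = 1.0) -> TimeInterval? {
    guard distanceMeters > 0, distanceMeters < maxRange else { return nil }
    let t = Double(distanceMeters / maxRange)        // 0 (right in front) ... 1 (at max range)
    return minInterval + (maxInterval - minInterval) * t
}
```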

How we built it

  • We developed the iOS application in Swift, which gave us access to many iPhone APIs, including LIDAR depth data, Spatial Sound, Siri text-to-speech, and more (see the depth-reading sketch after this list).
  • We built the backend with Node.js and Express.js, serving a REST API for the app so it could easily interface with the mapping and speech-to-text APIs.
  • We transcribed the user's voice using the Google Gemini API.
  • Navigation was handled through the Google Maps API: given the user's transcribed command, we asked Google Maps for a route from the user's current location to the destination.
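For context, here is a minimal sketch of how the LIDAR depth stream can be read on supported iPhones through ARKit's scene-depth frame semantics. The class name and the print statement are illustrative; the actual app layers obstacle detection and beep scheduling on top of the depth frames.

```swift
import ARKit

final class DepthReader: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        // Scene depth is only available on LIDAR-equipped devices.
        guard ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) else { return }
        let config = ARWorldTrackingConfiguration()
        config.frameSemantics = .sceneDepth       // ask ARKit for the LIDAR depth map
        session.delegate = self
        session.run(config)
    }

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        guard let depthMap = frame.sceneDepth?.depthMap else { return }
        // depthMap is a CVPixelBuffer of Float32 distances in meters.
        let width = CVPixelBufferGetWidth(depthMap)
        let height = CVPixelBufferGetHeight(depthMap)
        print("Depth frame: \(width)x\(height)")
    }
}
```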

Challenges we ran into

🦇 LIDAR is challenging to use because its range is limited and it has a wide field of view. This made the obstacle detection algorithm tricky to develop: the LIDAR would constantly report the floor as the nearest object, giving us false positives for obstructions. To remedy this, we only considered the segment of the LIDAR data not pointing into the ground.
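The floor-rejection idea can be sketched as follows: rather than scanning the whole depth map for the nearest value, only sample a band of pixels roughly aligned with where the phone is pointing, so readings aimed down at the floor never become the "nearest obstacle". The band bounds below are illustrative; which edge of the depth map sees the floor depends on how the phone is held.

```swift
import ARKit

// Illustrative floor rejection: scan only the middle third of depth-map rows
// for the nearest reading, ignoring the rows that tend to see the floor.
func nearestObstacleDistance(in depthMap: CVPixelBuffer) -> Float? {
    CVPixelBufferLockBaseAddress(depthMap, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(depthMap, .readOnly) }

    guard let base = CVPixelBufferGetBaseAddress(depthMap) else { return nil }
    let width = CVPixelBufferGetWidth(depthMap)
    let height = CVPixelBufferGetHeight(depthMap)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(depthMap)

    var nearest = Float.greatestFiniteMagnitude
    for row in (height / 3)..<(2 * height / 3) {
        let rowPointer = base.advanced(by: row * bytesPerRow)
            .assumingMemoryBound(to: Float32.self)
        for col in 0..<width where rowPointer[col] > 0 {
            nearest = min(nearest, rowPointer[col])
        }
    }
    return nearest == .greatestFiniteMagnitude ? nil : nearest
}
```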

🔊 Spatial Sound was very challenging to work with because it adds an extra dimension beyond ordinary stereo. Getting Spatial Sound (i.e., compass-like sound anchored to a direction in the world) to accurately place sources of sound in the space around the user took many calculations, far more than simple head-relative audio does. We had to consider the orientation of the phone and of the headphones, both with respect to each other and with respect to true north, which meant combining several sensor measurements. One feature was especially important to us: the source of the sound should remain stationary while the user rotates their head. This gives the illusion that the sound is coming from somewhere outside the headphones, making the experience much more immersive.
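A stripped-down version of that head-tracking setup, assuming AirPods-style headphones that expose motion data through CMHeadphoneMotionManager, might look like the sketch below. The true-north and phone-orientation corrections described above are omitted; only the headphone attitude is fed into the listener orientation.

```swift
import AVFoundation
import CoreMotion

final class SpatialBeeper {
    private let engine = AVAudioEngine()
    private let environment = AVAudioEnvironmentNode()
    private let player = AVAudioPlayerNode()
    private let headphones = CMHeadphoneMotionManager()

    func start(beepBuffer: AVAudioPCMBuffer) throws {
        engine.attach(environment)
        engine.attach(player)
        // Spatialization requires a mono source feeding the environment node.
        let mono = AVAudioFormat(standardFormatWithSampleRate: 44_100, channels: 1)
        engine.connect(player, to: environment, format: mono)
        engine.connect(environment, to: engine.mainMixerNode, format: nil)

        // Park the beep two meters ahead of the listener (negative z is "in front").
        player.position = AVAudio3DPoint(x: 0, y: 0, z: -2)
        try engine.start()
        player.scheduleBuffer(beepBuffer, at: nil, options: .loops)
        player.play()

        // Rotate the listener with the headphone attitude so the beep appears
        // to stay fixed in the room while the user's head turns.
        headphones.startDeviceMotionUpdates(to: .main) { [weak self] motion, _ in
            guard let attitude = motion?.attitude else { return }
            self?.environment.listenerAngularOrientation = AVAudio3DAngularOrientation(
                yaw: Float(attitude.yaw * 180 / .pi),
                pitch: Float(attitude.pitch * 180 / .pi),
                roll: Float(attitude.roll * 180 / .pi))
        }
    }
}
```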

📱 Swift and iOS development were challenging simply because our team was so inexperienced: we had practically no iOS development experience, and only two of our team members even had MacBooks. To work around having so few usable computers, we pushed as much work as we could to the backend, which anyone could contribute to, and we used pair programming.

🚅 HTTP requests were challenging because our use case is very latency-sensitive. We also had some difficulty getting the HTTP server and the iOS client to talk to each other, since Swift handles HTTP in some unexpected ways.
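On the Swift side, the request path ended up looking roughly like the sketch below; the endpoint name and payload shape are made up for illustration, but the short timeout reflects how latency-sensitive the flow is.

```swift
import Foundation

// Hypothetical request/response types for the navigation endpoint.
struct RouteRequest: Codable { let destination: String; let lat: Double; let lon: Double }
struct RouteResponse: Codable { let instructions: [String] }

func fetchRoute(_ request: RouteRequest, baseURL: URL) async throws -> RouteResponse {
    var urlRequest = URLRequest(url: baseURL.appendingPathComponent("route"))
    urlRequest.httpMethod = "POST"
    urlRequest.setValue("application/json", forHTTPHeaderField: "Content-Type")
    urlRequest.httpBody = try JSONEncoder().encode(request)
    urlRequest.timeoutInterval = 5   // fail fast; the user is waiting to start walking

    let (data, response) = try await URLSession.shared.data(for: urlRequest)
    guard let http = response as? HTTPURLResponse, http.statusCode == 200 else {
        throw URLError(.badServerResponse)
    }
    return try JSONDecoder().decode(RouteResponse.self, from: data)
}
```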

🌎 GPS and navigation were difficult to work with, since coordinates are expressed in latitude and longitude rather than the linear units like meters and miles that we are used to. GPS code is also incredibly difficult to debug, since you have to physically go outside and move around over large distances; a GPS simulator would make it possible to do indoors, but we don't have tens of thousands of dollars to spend on one.
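For the unit mismatch, CoreLocation can do the geodesic math, so latitude/longitude deltas never have to be converted to meters by hand. The coordinates below are just example values.

```swift
import CoreLocation

// Distance in meters between the user and the next waypoint,
// computed from raw latitude/longitude pairs.
let userLocation = CLLocation(latitude: 29.6465, longitude: -82.3533)
let nextWaypoint = CLLocation(latitude: 29.6480, longitude: -82.3510)

let metersToWaypoint = userLocation.distance(from: nextWaypoint)  // CLLocationDistance (meters)
print("Next waypoint in \(Int(metersToWaypoint)) m")
```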

📕 MongoDB Atlas: we originally wanted to use MongoDB Atlas as our backend database. However, after setting up the database online, our attempts to connect to it from a JavaScript program never went through. Many other people at the hackathon seemed to have the same problem, and some hypothesized that the USF network was blocking traffic on certain ports.

Accomplishments that we're proud of

  • We are proud to have done so much complex iOS development. At the beginning of the weekend, we were total noobs at iOS development, but by the end we had explored many mobile development topics, spanning sensor usage, stereo and fully spatialized audio, and GPS data.
  • We were also able to apply our mathematical and spatial reasoning skills to combine a fairly complex set of sensor readings and simulate the spatial audio correctly.

What we learned

  • LIDAR
  • Spatial Sound
  • iOS Development
  • Object Detection
  • Global Positioning System
  • HTTP Server
  • Gyroscopic Orientation

What's next for ScenedSound

  • More discreet: one of the main advantages of this approach over other aids is that it draws very little attention to the user. By using bone conduction headphones, the user would be able to navigate while still perceiving ambient sound.
  • Stair detection and improved object detection: we developed an edge detection algorithm for LIDAR data that reveals obstacles. It would be nice to have a voice actively identify obstacles for the user, so we would like to combine our edge detection algorithm with object detection models in the future.
  • More accessible: we understand that our users may struggle to interact with the application because of their vision. We would like to add more audio cues and features such as a lost mode that activates if the user strays too far from the route. We also recognize that, with further improvements, a realistic soundscape could appeal to anyone, not just blind or visually impaired users.
