Inspiration

In emergency scenarios like fires, earthquakes, or search-and-rescue operations, swiftly locating people or critical objects can mean the difference between life and death. Current methods require human responders to enter hazardous environments, risking injury and wasting precious time. Even in everyday contexts, manually searching for misplaced items in large spaces (e.g., warehouses, hospitals, or homes) is inefficient and labor-intensive. The challenge lies in creating a reliable, automated system for these tasks that can reduce human risk, save time, and adapt to diverse search scenarios.

What it does

Our solution is TSR, an autonomous robot designed to locate objects or individuals rapidly and safely. Users enter a target object (e.g., “fire extinguisher” or “person”) through a simple interface, and the robot uses AI-powered computer vision and basic control principles to navigate its environment, scanning its surroundings in real time. TSR drives autonomously until it identifies the target, then moves toward it and alerts users with an attention-grabbing beep. This minimizes human exposure to danger, speeds up searches, and scales to applications ranging from disaster response to everyday use. By combining affordability with adaptability, TSR aims to revolutionize how we approach search tasks, offering a proof-of-concept prototype with lifesaving potential.

How we built it

We started by brainstorming our goals, taking into account the advantages of our robotic car, “the trike”—specifically, its ability to turn its wheels while moving without needing to stop. We connected the Raspberry Pi–powered car to a server and a remote host, first testing basic motion capabilities: moving the camera up and down, steering left and right, and performing a full 360-degree rotation. Next, we experimented with YOLO models for real-time object detection to evaluate feasible navigation paths, but YOLO proved too slow for our needs, which prompted us to explore quantization through ONNX and NCNN. We then tried a smaller model, MobileNet, to improve FPS, but after further research we settled on MobileCLIP for its zero-shot capability: it can score how closely a given text prompt matches an image. We incorporated this into our pipeline, enabling the robot to rotate, identify the desired object, and then autonomously move toward it.
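The heart of that pipeline is the zero-shot matching step. The sketch below shows roughly how it works, assuming the reference `mobileclip` package from Apple's ml-mobileclip repository; the model variant, checkpoint path, threshold idea, and `match_score` helper are illustrative placeholders rather than our exact code.

```python
# Rough sketch of the zero-shot matching step (illustrative, not our exact code).
# Assumes the reference `mobileclip` package and a locally downloaded checkpoint.
import torch
from PIL import Image
import mobileclip

model, _, preprocess = mobileclip.create_model_and_transforms(
    "mobileclip_s0", pretrained="checkpoints/mobileclip_s0.pt"  # placeholder path
)
tokenizer = mobileclip.get_tokenizer("mobileclip_s0")
model.eval()

def match_score(frame: Image.Image, prompt: str) -> float:
    """Cosine similarity between a camera frame and the text prompt."""
    image = preprocess(frame).unsqueeze(0)   # (1, 3, H, W)
    text = tokenizer([prompt])               # tokenized prompt
    with torch.no_grad():
        img_feat = model.encode_image(image)
        txt_feat = model.encode_text(text)
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
        return (img_feat @ txt_feat.T).item()

# During a scan, the robot rotates in steps, scores one frame per heading, and
# drives toward the heading whose score first clears a chosen threshold.
```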

Challenges we ran into

One key challenge was building a model and data pipeline that could deliver efficient inference given the hardware constraints of a Raspberry Pi 4. While the Pi 4 is a capable edge computing device, the compilation and execution of our models introduced significant delays that prevented real-time performance. We therefore experimented with a few workarounds: lowering the resolution of the frames captured from the camera feed and trying different quantization methods, which reduce the compute and memory cost of inference by representing model weights and activations with lower-precision numerical values.
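As one illustration of the quantization route, dynamic INT8 quantization of an exported ONNX encoder with onnxruntime looks roughly like this; the file names are placeholders, and this is a sketch of the general approach rather than our exact script.

```python
# Dynamic INT8 quantization of an exported ONNX model with onnxruntime.
# File names are placeholders; static (calibration-based) quantization is another option.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="image_encoder.onnx",        # FP32 export of the encoder
    model_output="image_encoder_int8.onnx",  # weights stored as 8-bit integers
    weight_type=QuantType.QUInt8,
)
```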

Accomplishments that we're proud of

We are particularly proud of optimizing Apple’s MobileCLIP model for efficient inference on resource-constrained hardware like the Raspberry Pi. By quantizing the model and trimming the input pipeline, we reduced computational overhead while still maintaining strong matching performance. A key achievement was bringing inference latency under 1000 milliseconds, allowing near real-time processing despite the device’s limited processing power.
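A simple timing harness like the one below is enough to check that a single end-to-end inference stays inside that budget on the Pi; the input shape and file name are illustrative assumptions, not our exact benchmark.

```python
# Illustrative latency check for a quantized ONNX encoder on the Pi's CPU.
import time
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("image_encoder_int8.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 256, 256).astype(np.float32)  # placeholder input shape

latencies_ms = []
for _ in range(20):
    start = time.perf_counter()
    session.run(None, {input_name: dummy})
    latencies_ms.append((time.perf_counter() - start) * 1000)

print(f"median inference latency: {np.median(latencies_ms):.0f} ms")
```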

What we learned

We gained a deeper understanding of deploying AI models on low-power hardware, learning how constraints such as limited memory affect model selection and performance. Optimizing inference speed required us to explore different quantization methods and rethink our approach when YOLO proved too slow. We also learned the value of adaptability, quickly testing and refining our methods to find a balance between accuracy and efficiency. Beyond the technical aspects, we strengthened our problem-solving skills by troubleshooting motion control and designing a pipeline that lets the robot make autonomous decisions. This experience highlighted the challenges of working with edge computing and the importance of efficient model design for real-world applications.

What's next for TSR

We want to enhance the robot’s decision-making by incorporating reinforcement learning or more advanced planning algorithms to make its movements smarter and more efficient. Another key step is improving the system’s adaptability—expanding its ability to recognize a wider range of objects and environments. Additionally, we aim to integrate better communication between the Raspberry Pi and the server, possibly leveraging edge computing strategies to reduce latency even further. Ultimately, we want to push the limits of what’s possible on low-power hardware, making our robotic system more autonomous, responsive, and capable in real-world applications.
