Inspiration

Many visually impaired individuals still struggle with daily tasks that sighted people take for granted, such as reading labels, identifying objects, or simply understanding their surroundings. Existing tools are too slow, too complex, or dependent on expensive hardware. We wanted to build a solution that works instantly, anywhere, on any mobile device.

What it does

AI Eyes is a real-time mobile web assistant that acts as “digital vision” for visually impaired users. Through the phone camera and a voice interface, AI Eyes can:

  • Describe the user’s environment
  • Identify objects, signs, or people
  • Read text such as labels, menus, medication boxes
  • Respond conversationally to questions like “What am I looking at?”

Everything is delivered through immediate audio feedback: no typing, no waiting, just natural interaction.

How we built it
  • Frontend: A lightweight mobile-optimized web app built with React, capturing video/audio directly from the device.
  • Backend: A Python microservice handling image frames and speech queries using LangChain and multimodal LLMs (sketched after this list).
  • Real-time processing: Streaming inference for faster responses, optimized to avoid long LLM “thinking time.”
  • Audio pipeline: Browser mic input → backend → speech-to-text → AI reasoning → TTS response streamed back to the user (see the second sketch after this list).
  • Deployment: Hosted on a cloud VM with HTTPS, tuned for low-latency API calls.
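The bullets above only name the pieces, so here is a minimal sketch of how the backend’s streaming multimodal call could look. It assumes an OpenAI-compatible vision model reached through langchain_openai; the actual model, provider, and prompt we used are not spelled out in this write-up.

```python
# Minimal sketch: stream a scene description for one camera frame.
# Assumes an OpenAI-compatible multimodal chat model via langchain_openai;
# the model name below is an assumption, not necessarily what AI Eyes runs.
import base64
from collections.abc import Iterator

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # placeholder model name

def describe_frame(jpeg_bytes: bytes,
                   question: str = "What am I looking at?") -> Iterator[str]:
    """Yield partial description text as the LLM streams tokens."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    message = HumanMessage(
        content=[
            {"type": "text",
             "text": f"You are assisting a visually impaired user. {question}"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ]
    )
    # Stream chunks so the caller can start speaking before the answer finishes.
    for chunk in llm.stream([message]):
        if chunk.content:
            yield chunk.content
```

Streaming is what lets the app start talking on the first sentence instead of waiting out the full LLM response, which is the “speed first” trade-off we describe below.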
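And a second sketch for the voice round trip in the audio-pipeline bullet. FastAPI and the transcribe()/synthesize_speech() helpers are placeholders here: this write-up does not say which web framework or STT/TTS services AI Eyes actually uses.

```python
# Sketch of the voice round trip: mic audio + camera frame in, spoken answer out.
# FastAPI and the two helpers are assumptions standing in for whatever the
# real microservice uses; describe_frame() is the streaming sketch above.
from fastapi import FastAPI, File, UploadFile
from fastapi.responses import StreamingResponse

app = FastAPI()

def transcribe(audio_bytes: bytes) -> str:
    """Hypothetical speech-to-text call (e.g. a hosted STT API)."""
    raise NotImplementedError

def synthesize_speech(text: str) -> bytes:
    """Hypothetical text-to-speech call returning an audio chunk."""
    raise NotImplementedError

@app.post("/ask")
async def ask(frame: UploadFile = File(...), audio: UploadFile = File(...)):
    question = transcribe(await audio.read())   # speech -> text
    frame_bytes = await frame.read()

    def audio_chunks():
        buffer = ""
        # Speak complete sentences as soon as they arrive instead of waiting
        # for the whole answer, which hides most of the LLM latency.
        for partial in describe_frame(frame_bytes, question):
            buffer += partial
            while "." in buffer:
                sentence, buffer = buffer.split(".", 1)
                yield synthesize_speech(sentence + ".")
        if buffer.strip():
            yield synthesize_speech(buffer)

    return StreamingResponse(audio_chunks(), media_type="audio/mpeg")
```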
Challenges we ran into

  • Ensuring real-time responsiveness on mobile networks.
  • Balancing local vs cloud inference to control costs and achieve speed.
  • Cloud hosting costs, especially for inference, compute, and the domain name.

Accomplishments that we're proud of
  • Achieved nearly instant environment descriptions using streaming inference.
  • Built a fully accessible voice interface compatible with mobile browsers.
  • Created a simple, no-install solution that anyone can use with just a URL.
  • Successfully tested object detection, text reading, and Q&A in real-world scenarios.

What we learned
  • Real-time AI for accessibility requires speed first, accuracy second.
  • Accessibility design goes far beyond UI: it’s about predictability, trust, and clear feedback.
  • Vision + voice AI can dramatically improve independence for visually impaired users.

What's next for AI Eyes
  • Add offline/edge inference for faster processing and reduced cloud cost.
  • Build scene memory so the assistant can track objects over time (one possible shape is sketched after this list).
  • Introduce gesture or voice-only navigation for full hands-free use.
  • Partner with Singapore accessibility organizations to pilot real-world deployment.
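Scene memory is still future work, so the following is purely illustrative: one simple way it could take shape is a rolling, time-bounded buffer of observations that the assistant queries when asked about objects it has seen before. Every name below is hypothetical.

```python
# Purely illustrative sketch of the planned "scene memory": a rolling buffer of
# timestamped observations the assistant could consult to answer questions like
# "where did I put my keys?". None of this exists in the current prototype.
import time
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Observation:
    label: str            # e.g. "keys", "door", "medication box"
    description: str      # short description of what was seen
    timestamp: float = field(default_factory=time.time)

class SceneMemory:
    def __init__(self, max_items: int = 200, max_age_s: float = 600.0):
        self._items: deque[Observation] = deque(maxlen=max_items)
        self._max_age_s = max_age_s

    def add(self, label: str, description: str) -> None:
        self._items.append(Observation(label, description))

    def recall(self, query: str) -> list[Observation]:
        """Return recent, non-expired observations whose label matches the query."""
        now = time.time()
        return [
            obs for obs in self._items
            if now - obs.timestamp <= self._max_age_s
            and query.lower() in obs.label.lower()
        ]
```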
