Inspiration

Many visually impaired individuals still struggle with daily tasks that sighted people take for granted, such as reading labels, identifying objects, or simply understanding their surroundings. Existing tools are too slow, too complex, or dependent on expensive hardware. We wanted to build a solution that works instantly, anywhere, on any mobile device.

What it does

AI Eyes is a real-time mobile web assistant that acts as “digital vision” for visually impaired users. Through the phone camera and a voice interface, AI Eyes can:

  • Describe the user’s environment
  • Identify objects, signs, or people
  • Read text such as labels, menus, medication boxes
  • Respond conversationally to questions like “What am I looking at?”

Everything is delivered through immediate audio feedback: no typing, no waiting, just natural interaction.

How we built it
  • Frontend: A lightweight mobile-optimized web app built with React, capturing video/audio directly from the device.
  • Backend: A Python microservice handling image frames and speech queries using LangChain and multimodal LLMs (sketched after this list).
  • Real-time processing: Streaming inference for faster responses, optimized to avoid long LLM “thinking time.”
  • Audio pipeline: Browser mic input → backend → speech-to-text → AI reasoning → TTS response streamed back to the user (see the second sketch after this list).
  • Deployment: Hosted on a cloud VM with HTTPS, tuned for low-latency API calls.
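The bullets above only name the pieces, so here is a minimal sketch of how the backend’s streaming multimodal call could look. It assumes an OpenAI-compatible vision model reached through langchain_openai; the actual model, provider, and prompt we used are not spelled out in this write-up.

```python
# Minimal sketch: stream a scene description for one camera frame.
# Assumes an OpenAI-compatible multimodal chat model via langchain_openai;
# the model name below is an assumption, not necessarily what AI Eyes runs.
import base64
from collections.abc import Iterator

from langchain_core.messages import HumanMessage
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini")  # placeholder model name

def describe_frame(jpeg_bytes: bytes,
                   question: str = "What am I looking at?") -> Iterator[str]:
    """Yield partial description text as the LLM streams tokens."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    message = HumanMessage(
        content=[
            {"type": "text",
             "text": f"You are assisting a visually impaired user. {question}"},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ]
    )
    # Stream chunks so the caller can start speaking before the answer finishes.
    for chunk in llm.stream([message]):
        if chunk.content:
            yield chunk.content
```

Streaming is what lets the app start talking on the first sentence instead of waiting out the full LLM response, which is the “speed first” trade-off we describe below.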
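And a second sketch for the voice round trip in the audio-pipeline bullet. FastAPI and the transcribe()/synthesize_speech() helpers are placeholders here: this write-up does not say which web framework or STT/TTS services AI Eyes actually uses.

```python
# Sketch of the voice round trip: mic audio + camera frame in, spoken answer out.
# FastAPI and the two helpers are assumptions standing in for whatever the
# real microservice uses; describe_frame() is the streaming sketch above.
from fastapi import FastAPI, File, UploadFile
from fastapi.responses import StreamingResponse

app = FastAPI()

def transcribe(audio_bytes: bytes) -> str:
    """Hypothetical speech-to-text call (e.g. a hosted STT API)."""
    raise NotImplementedError

def synthesize_speech(text: str) -> bytes:
    """Hypothetical text-to-speech call returning an audio chunk."""
    raise NotImplementedError

@app.post("/ask")
async def ask(frame: UploadFile = File(...), audio: UploadFile = File(...)):
    question = transcribe(await audio.read())   # speech -> text
    frame_bytes = await frame.read()

    def audio_chunks():
        buffer = ""
        # Speak complete sentences as soon as they arrive instead of waiting
        # for the whole answer, which hides most of the LLM latency.
        for partial in describe_frame(frame_bytes, question):
            buffer += partial
            while "." in buffer:
                sentence, buffer = buffer.split(".", 1)
                yield synthesize_speech(sentence + ".")
        if buffer.strip():
            yield synthesize_speech(buffer)

    return StreamingResponse(audio_chunks(), media_type="audio/mpeg")
```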
Challenges we ran into

  • Ensuring real-time responsiveness on mobile networks.
  • Balancing local vs cloud inference to control costs and achieve speed.
  • Cloud hosting costs, especially for inference, compute, and the domain name.

Accomplishments that we're proud of
  • Achieved nearly instant environment descriptions using streaming inference.
  • Built a fully accessible voice interface compatible with mobile browsers.
  • Created a simple, no-install solution that anyone can use with just a URL.
  • Successfully tested object detection, text reading, and Q&A in real-world scenarios.

What we learned
  • Real-time AI for accessibility requires speed first, accuracy second.
  • Accessibility design goes far beyond UI: it’s about predictability, trust, and clear feedback.
  • Vision + voice AI can dramatically improve independence for visually impaired users.

What's next for AI Eyes
  • Add offline/edge inference for faster processing and reduced cloud cost.
  • Build scene memory so the assistant can track objects over time (one possible shape is sketched after this list).
  • Introduce gesture or voice-only navigation for full hands-free use.
  • Partner with Singapore accessibility organizations to pilot real-world deployment.
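Scene memory is still future work, so the following is purely illustrative: one simple way it could take shape is a rolling, time-bounded buffer of observations that the assistant queries when asked about objects it has seen before. Every name below is hypothetical.

```python
# Purely illustrative sketch of the planned "scene memory": a rolling buffer of
# timestamped observations the assistant could consult to answer questions like
# "where did I put my keys?". None of this exists in the current prototype.
import time
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Observation:
    label: str            # e.g. "keys", "door", "medication box"
    description: str      # short description of what was seen
    timestamp: float = field(default_factory=time.time)

class SceneMemory:
    def __init__(self, max_items: int = 200, max_age_s: float = 600.0):
        self._items: deque[Observation] = deque(maxlen=max_items)
        self._max_age_s = max_age_s

    def add(self, label: str, description: str) -> None:
        self._items.append(Observation(label, description))

    def recall(self, query: str) -> list[Observation]:
        """Return recent, non-expired observations whose label matches the query."""
        now = time.time()
        return [
            obs for obs in self._items
            if now - obs.timestamp <= self._max_age_s
            and query.lower() in obs.label.lower()
        ]
```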
