Inspiration
Vision impairment impacts so many lives. In the U.S. alone, 7 million people live with vision impairment, including 1 million who are completely blind. Among children under 18, about 6.8% have an eye condition, and 3% are blind or visually impaired even with glasses or contacts.
At the same time, research shows that children who are read to regularly hear 290,000 more words by the time they start kindergarten than those who aren’t. It’s also linked to better school performance, improved mental health, and less time spent glued to screens.
StoryVue is our way of making sure every child has a chance to explore books, learn, and grow.
What it does
StoryVue is a voice-powered reading assistant that helps visually impaired children read independently.
Using OCR (Tesseract) and OpenAI’s language model, the app reads printed text out loud and can even summarize or explain what’s on the page. It’s completely hands-free, so kids can interact using just their voice with no need to press buttons or see the screen.
Key Features
- Real-time OCR (Optical Character Recognition) to capture text instantly
- AI summaries to help kids understand complex content
- Speech-to-Text (STT) for full voice control
- Text-to-Speech (TTS) for natural, easy listening (a simplified voice-loop sketch follows this list)
- Node.js + Express.js backend to keep everything running smoothly
- TensorFlow to detect when a book is actually in the camera frame before OCR runs
- LiveKit API for future collaborative reading sessions
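To give a feel for the hands-free flow, here's a minimal sketch of a voice loop built on the browser's Web Speech API. The wake words and the `startReadingFlow()` helper are placeholders for illustration, not our exact code.

```javascript
// Minimal sketch of a hands-free voice loop with the Web Speech API.
// The command words and startReadingFlow() are illustrative placeholders.
const SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
const recognition = new SpeechRecognition();
recognition.lang = 'en-US';
recognition.continuous = true;

function speak(text) {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = 0.9; // a touch slower for young listeners
  window.speechSynthesis.speak(utterance);
}

recognition.onresult = (event) => {
  const phrase = event.results[event.results.length - 1][0].transcript.toLowerCase();
  if (phrase.includes('read')) {
    speak('Okay! Hold your book up to the camera and I will start reading.');
    // startReadingFlow(); // hypothetical helper that kicks off camera capture + OCR
  } else if (phrase.includes('stop')) {
    window.speechSynthesis.cancel(); // stop talking immediately
  }
};

recognition.start();
```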
How we built it
We started by combining real-time OCR (Optical Character Recognition) and AI language processing with voice technology to create a smooth, fully accessible experience. We've added a few simplified code sketches after the list below to show roughly how the pieces connect.
- TensorFlow – Used to train a model that detects whether a book is in the camera frame and provides a confidence rating before running OCR.
- Camera Capture – Streams live video so we can identify the book in real-time and send clean frames to the OCR system.
- Tesseract.js OCR – Extracts text from the captured images quickly and efficiently.
- Text Extraction Pipeline – Cleans and organizes the scanned text before sending it to the backend.
- Node.js + Express.js Backend – Acts as the core hub, connecting all services and managing requests.
- OpenAI GPT-3.5 API – Generates natural, conversational reading experiences, explains tricky concepts, and creates summaries for better understanding.
- LiveKit API – Handles real-time audio and video streaming for future collaborative reading sessions.
- HTML + CSS – Creates our accessible interface, giving kids the reading experience without needing to see or touch the screen.
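Here's a simplified sketch of the book-detection gate, assuming the classifier is exported as a TensorFlow.js LayersModel; the model path, 224×224 input size, and 0.8 confidence threshold are placeholders rather than our exact values.

```javascript
// Sketch of the "is a book in frame?" check with a TensorFlow.js
// LayersModel binary classifier. Model path, input size, and the 0.8
// threshold are placeholders.
import * as tf from '@tensorflow/tfjs';

const model = await tf.loadLayersModel('/models/book-detector/model.json');

function bookConfidence(videoElement) {
  return tf.tidy(() => {
    const pixels = tf.browser.fromPixels(videoElement).toFloat(); // current camera frame
    const input = tf.image.resizeBilinear(pixels, [224, 224])     // match the model's input size
      .div(255)                                                   // normalize pixels to [0, 1]
      .expandDims(0);                                             // add a batch dimension
    return model.predict(input).dataSync()[0];                    // confidence a book is visible
  });
}

const video = document.querySelector('video');
if (bookConfidence(video) > 0.8) {
  // Confident enough: hand this frame to the OCR step (see the next sketch).
}
```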
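The capture-to-text path looks roughly like this, using an offscreen canvas to snapshot the camera feed and Tesseract.js's v5 worker API; the cleanup rules shown stand in for our fuller extraction pipeline.

```javascript
// Sketch of camera capture -> Tesseract.js OCR -> text cleanup.
// Assumes a <video> element already fed by getUserMedia; the cleanup
// regexes are simplified stand-ins.
import { createWorker } from 'tesseract.js';

const worker = await createWorker('eng'); // downloads + initializes the English model

function captureFrame(videoElement) {
  // Snapshot the current frame onto an offscreen canvas and return a data URL.
  const canvas = document.createElement('canvas');
  canvas.width = videoElement.videoWidth;
  canvas.height = videoElement.videoHeight;
  canvas.getContext('2d').drawImage(videoElement, 0, 0);
  return canvas.toDataURL('image/png');
}

async function extractText(videoElement) {
  const { data } = await worker.recognize(captureFrame(videoElement));
  return data.text
    .replace(/-\n/g, '')   // rejoin words hyphenated across lines
    .replace(/\s+/g, ' ')  // collapse line breaks and extra spaces
    .trim();
}
```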
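On the backend, the hub boils down to routes like the one sketched below, forwarding the extracted text to GPT-3.5 for a kid-friendly summary; the route path, prompt wording, and port are illustrative only.

```javascript
// Sketch of the backend hub: an Express route that sends OCR'd text to
// GPT-3.5 and returns a kid-friendly summary. Route path, prompt, and
// port are illustrative.
import express from 'express';
import OpenAI from 'openai';

const app = express();
app.use(express.json());

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

app.post('/api/summarize', async (req, res) => {
  try {
    const completion = await openai.chat.completions.create({
      model: 'gpt-3.5-turbo',
      messages: [
        { role: 'system', content: 'You explain book pages to young children in short, friendly sentences.' },
        { role: 'user', content: `Summarize this page:\n\n${req.body.text}` },
      ],
    });
    res.json({ summary: completion.choices[0].message.content });
  } catch (err) {
    res.status(500).json({ error: 'Could not summarize this page right now.' });
  }
});

app.listen(3000);
```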
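And for the collaborative reading sessions we're planning, joining a shared room with the livekit-client SDK would look roughly like this; the server URL and token handling are placeholders.

```javascript
// Sketch of joining a shared reading room with livekit-client.
// The server URL and access token are placeholders; tokens would normally
// be minted by our backend.
import { Room, RoomEvent } from 'livekit-client';

const accessToken = '<token minted by the backend>'; // placeholder

const room = new Room();

// Play any audio track published by the other participant
// (e.g. a parent or reading buddy reading along remotely).
room.on(RoomEvent.TrackSubscribed, (track) => {
  if (track.kind === 'audio') {
    document.body.appendChild(track.attach());
  }
});

await room.connect('wss://example.livekit.cloud', accessToken); // placeholder URL
await room.localParticipant.setMicrophoneEnabled(true);         // share the child's voice
```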

Challenges we ran into
- Speech-to-Text worked great early on, but TTS gave us a lot of trouble and didn’t always sound natural or consistent.
- Tesseract would work one moment and completely crash the next due to outdated libraries and conflicting git pulls and pushes across the team.
- Getting all our libraries and services to play nicely together was a challenge in itself.
Accomplishments that we're proud of
- Integrating OpenAI’s LLM to make reading interactive and dynamic!
- Achieving real-time OCR!
- Building voice controls that make the app hands-free!
- With these achievements, we’ve made our most comprehensive app to date while also tackling an issue we are passionate about!!
What we learned
- How to put user-first design at the center when building for people with disabilities.
- How to troubleshoot and stabilize open-source libraries for real-time performance.
- How tricky it is to combine multiple advanced tools (OCR, AI, voice tech) into one smooth experience.
What's next for StoryVue
- Collaborative Reading: Allow multiple users to read together remotely using LiveKit.
- Parent/Teacher Portal: Track progress, identify challenging words, and suggest follow-up learning activities.
- New Subjects: Read and interpret math equations and describe images and diagrams for history and science so kids don’t miss out on visual learning.
Built With
- css
- express.js
- html
- javascript
- node.js
- openai
- tensorflow