PixReader
Inspiration
I wanted to create an assistive technology that was more robust than what's currently available. Web developers don't always provide alt-text for their images, so I wanted to build an alternative that doesn't depend on alt-text at all.
What it does
PixReader is an assistive screen reader that reads on-screen text aloud and, with the help of computer vision, generates and reads captions for images!
How I built it
I built this with Microsoft Cognitive Services (the Describe Image API) for captioning and Google Cloud Text-to-Speech for audio, all strung together with Python 3.
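Roughly, the pipeline is: grab an image, ask the Describe Image API for a caption, then synthesize that caption as speech with Google Cloud Text-to-Speech. Below is a minimal sketch of that flow, assuming a REST call to the Describe Image endpoint and the google-cloud-texttospeech client library; the endpoint region, API key, and file names are placeholders, not the exact hackathon code.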
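```python
# Sketch of the caption-and-speak pipeline. AZURE_KEY, the endpoint
# region, and the file paths are placeholders.
import requests
from google.cloud import texttospeech

AZURE_KEY = "YOUR_AZURE_KEY"  # placeholder subscription key
AZURE_ENDPOINT = "https://westus.api.cognitive.microsoft.com/vision/v2.0/describe"


def caption_image(image_bytes: bytes) -> str:
    """Ask the Describe Image API for a caption of the given image."""
    resp = requests.post(
        AZURE_ENDPOINT,
        headers={
            "Ocp-Apim-Subscription-Key": AZURE_KEY,
            "Content-Type": "application/octet-stream",
        },
        data=image_bytes,
    )
    resp.raise_for_status()
    # The API returns ranked captions; take the top one.
    return resp.json()["description"]["captions"][0]["text"]


def speak(text: str, out_path: str = "caption.mp3") -> None:
    """Synthesize text to an MP3 with Google Cloud Text-to-Speech."""
    client = texttospeech.TextToSpeechClient()
    response = client.synthesize_speech(
        input=texttospeech.SynthesisInput(text=text),
        voice=texttospeech.VoiceSelectionParams(
            language_code="en-US",
            ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL,
        ),
        audio_config=texttospeech.AudioConfig(
            audio_encoding=texttospeech.AudioEncoding.MP3
        ),
    )
    with open(out_path, "wb") as f:
        f.write(response.audio_content)


if __name__ == "__main__":
    with open("example.jpg", "rb") as f:  # placeholder image
        caption = caption_image(f.read())
    speak("Image: " + caption)
```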
Challenges I ran into
I initially tried to build the screen reader portion of the project with Google's Web Speech API, but there's minimal documentation for it; the only resource I could find was a blog post translated from Polish. Additionally, after implementing Microsoft Cognitive Services for image caption generation, I tried to use Google's "im2text" model because I was impressed by its superior results. Sadly, all the pre-trained models I found online were subpar, and training it myself, even with my GPU, would have taken weeks, which is not a luxury TAMUhack provides.
Accomplishments that I'm proud of
My seating arrangement was unfortunate: I had to stay put to use the Ethernet cable and didn't really have room for teammates to join me. So I'm proud that I finished something this ambitious both by myself and so quickly!
What I learned
I learned how to use Google Cloud Text-to-Speech and Microsoft Cognitive Services!
What's next for PixReader
Hopefully an improved model for image caption generation!
Built With
- azure
- google-cloud
- microsoft-cognitive-services
- microsoft-computer-vision
- python