BrailleApp

Inspiration

Braille is the primary reading and writing system for millions of people with visual impairments, yet most people around them can't read it. Existing tools require specialized hardware or manual input. We wanted to make Braille readable using just a phone camera, with no extra devices and no prior knowledge required.

What it does

BrailleApp reads Braille text in real time using a camera or uploaded image and converts it to English.

Point your camera at any Braille surface
Dots are detected, snapped to a grid, and decoded into Braille Unicode
Text is translated to English using liblouis
Works live (refreshing every 500 ms) or on an uploaded still image

How we built it

Detection pipeline (runs every frame):

Letterbox resize with a RAM-aware memory guard (psutil)
CLAHE + adaptive threshold + morphology for image preprocessing
HoughCircles for fast, every-frame dot detection
YOLOv8n ONNX model for deep learning detection (every 3rd frame)
Ensemble merges both detectors by proximity; skew correction applied if needed
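The proximity merge in the last step can be sketched as follows. This is a minimal illustration rather than the project's actual code: the function name, the 6-pixel radius, and the choice to average matched pairs are all assumptions.

```python
from math import hypot

Dot = tuple[float, float]  # (x, y) dot centre in pixels


def merge_detections(hough: list[Dot], yolo: list[Dot], radius: float = 6.0) -> list[Dot]:
    """Merge two detector outputs; dots within `radius` pixels count as one dot."""
    merged: list[Dot] = list(hough)
    matched = [False] * len(hough)
    for yx, yy in yolo:
        # Look for the nearest Hough dot that has not been paired yet.
        best, best_d = None, radius
        for i, (hx, hy) in enumerate(hough):
            d = hypot(hx - yx, hy - yy)
            if not matched[i] and d <= best_d:
                best, best_d = i, d
        if best is None:
            # Only the YOLO detector saw this dot: keep it as-is.
            merged.append((yx, yy))
        else:
            # Both detectors agree: store the midpoint (an assumed policy).
            matched[best] = True
            hx, hy = hough[best]
            merged[best] = ((hx + yx) / 2, (hy + yy) / 2)
    return merged
```

On frames where the YOLO model is skipped, passing an empty list for `yolo` returns the Hough detections unchanged.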

Grid reconstruction:

Lattice-snap algorithm maps detected dots to Braille cell positions, preserving empty slots
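To illustrate the idea, here is a simplified sketch for a single line of Braille. It assumes the grid origin and the two spacings (dot-to-dot and cell-to-cell) are already known, whereas a real implementation would have to estimate them from the detections. The function and parameter names are hypothetical. The bit layout follows the Unicode Braille Patterns block, where dots 1 to 6 map to bits 0 to 5 above U+2800.

```python
from math import floor


def dots_to_braille(dots, x0, y0, dot_pitch, cell_pitch):
    """Snap noisy (x, y) dot centres to a Braille lattice and emit Unicode cells.

    x0, y0     -- position of dot 1 of the first cell (assumed known here)
    dot_pitch  -- spacing between neighbouring dots inside a cell
    cell_pitch -- spacing between the same dot in adjacent cells
    """
    cells = {}
    # Shift so the cell boundary falls midway between column 1 of one cell
    # and column 0 of the next.
    boundary_shift = (cell_pitch - dot_pitch) / 2
    for x, y in dots:
        rel_x = x - x0
        k = max(0, floor((rel_x + boundary_shift) / cell_pitch))
        col = 0 if rel_x - k * cell_pitch < dot_pitch / 2 else 1
        row = min(2, max(0, round((y - y0) / dot_pitch)))
        # Dots 1-3 are the left column, dots 4-6 the right column.
        cells[k] = cells.get(k, 0) | (1 << (row + 3 * col))
    if not cells:
        return ""
    # Cells with no dots become U+2800 (blank), so empty slots are preserved.
    return "".join(chr(0x2800 + cells.get(k, 0)) for k in range(max(cells) + 1))
```

For example, with `dot_pitch=10` and `cell_pitch=24`, slightly noisy dots for "a", a blank cell, "b", and "c" decode to four cells, and the blank cell stays in place.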

Translation:

liblouis backTranslateString handles Grade 1 and Grade 2 Braille → English
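A hedged sketch of what this call might look like through the liblouis Python bindings (`louis`), with a fallback for machines where the library is missing. The table names (`unicode.dis`, `en-ueb-g1.ctb`, `en-ueb-g2.ctb`) are assumptions, since the tables the project actually uses are not stated here. The wrapper name `back_translate` is hypothetical.

```python
from typing import Optional

try:
    import louis  # Python bindings that ship with liblouis
except (ImportError, OSError):  # bindings or shared library not installed
    louis = None


def back_translate(braille: str, table: str = "en-ueb-g2.ctb") -> Optional[str]:
    """Back-translate Unicode Braille to text, or return None without liblouis.

    `table` is an assumed default (UEB Grade 2); "en-ueb-g1.ctb" would be the
    uncontracted (Grade 1) counterpart.
    """
    if louis is None:
        return None
    # "unicode.dis" tells liblouis the Braille side uses U+2800..U+28FF.
    return louis.backTranslateString(["unicode.dis", table], braille)
```

Returning `None` lets the caller show the raw Braille Unicode instead of failing when translation is unavailable.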

Model training:

YOLOv8n trained on Google Colab (T4 GPU) on a combined Angelina + DSBI dataset (~500 images)
Augmentations: 30° rotation, perspective warp, and brightness variation, to handle pages that are tilted or unevenly lit in real-world use

Stack: Python, Streamlit, OpenCV, ONNX Runtime, liblouis

Challenges we ran into

Low model confidence on small datasets: the highest confidence we observed was 0.28, well below the standard 0.4 threshold. After empirical testing we lowered the threshold to 0.25, accepting more false positives in exchange for usable detections.
Single-dataset model failed entirely: training on only one dataset gave near-zero confidence (max 0.037). Combining two datasets (Angelina + DSBI) fixed this.
Grid reconstruction from noisy dot positions: real camera images produce imperfect dot clusters. Building a lattice-snap algorithm that tolerates noise while preserving empty Braille cells took significant iteration.
Memory pressure on Streamlit Cloud: live camera processing on a free-tier server required a psutil-based guard that downscales frames when RAM usage exceeds 700 MB.
liblouis on Windows: our setup installs the translation library through apt, which is available only on Linux, so development on Windows required graceful fallback handling.
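The memory guard mentioned above could work roughly like this. It is a sketch under stated assumptions: it measures the process's resident memory (the source does not say whether process or system memory is checked), it uses a 640-pixel letterbox target, and it halves the scale when over the limit. Those choices and both function names are illustrative.

```python
def process_rss_mb() -> float:
    """Resident memory of this process in MB; 0.0 if psutil is not installed."""
    try:
        import psutil
    except ImportError:
        return 0.0
    return psutil.Process().memory_info().rss / (1024 * 1024)


def guarded_frame_size(width, height, target=640, rss_mb=0.0, limit_mb=700.0):
    """Aspect-preserving (letterbox) size, shrunk further when memory is tight."""
    # Letterbox scale: fit the longer side inside a target x target square.
    scale = min(target / width, target / height)
    if rss_mb > limit_mb:
        # Assumed policy: halve the resolution once the limit is exceeded.
        scale *= 0.5
    return max(1, round(width * scale)), max(1, round(height * scale))
```

A caller would pass `rss_mb=process_rss_mb()` on each frame and resize to the returned size before running detection.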

Accomplishments that we're proud of

A fully working end-to-end pipeline: raw camera frame → English text
Hybrid detection (classical vision + deep learning) that's more robust than either alone
Lattice-based grid reconstruction that correctly handles missing dots and irregular spacing
Runs entirely in a browser with no install — deployed on Streamlit Community Cloud

What we learned

Combining classical computer vision with ML gives better real-world results than relying on either alone
Small training datasets need lower confidence thresholds and heavier augmentation
Dataset quality and diversity matter more than dataset size — merging two modest datasets outperformed one larger single-source dataset
Streamlit's fragment API enables live camera refresh without full page rerenders

What's next for BrailleApp

Expand training data to improve model confidence and accuracy
Support contracted (Grade 2) Braille more robustly across all tables
Mobile-optimized UI for direct phone camera use
Audio output (text-to-speech) for full accessibility
Support for multi-line Braille documents and page-level layout detection

Built With

onnx
python
streamlit
yolov8

Updates

Aakash Khepar started this project — Jun 01, 2026 04:36 AM EDT
