Inspiration

Braille is the primary written language for millions of people with visual impairments, yet most people around them can't read it. Existing tools require specialized hardware or manual input. We wanted to make Braille readable using just a phone camera — no extra devices, no prior knowledge required.


What it does

BrailleApp reads Braille text in real time using a camera or uploaded image and converts it to English.

  • Point your camera at any Braille surface
  • Dots are detected, snapped to a grid, and decoded into Braille Unicode
  • Text is translated to English using liblouis
  • Works live (500ms refresh) or on a still image upload

How we built it

Detection pipeline (runs every frame):

  • Letterbox resize with a RAM-aware memory guard (psutil)
  • CLAHE + adaptive threshold + morphology for image preprocessing
  • HoughCircles for fast, every-frame dot detection
  • YOLOv8n ONNX model for deep learning detection (every 3rd frame)
  • Ensemble merges both detectors by proximity; skew correction applied if needed
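
The proximity-based ensemble step could look roughly like the sketch below. It is an illustration under assumptions, not the project's actual code: the function name `merge_detections`, the `(x, y)` centre format, and the 6-pixel radius are all hypothetical.

```python
from math import hypot


def merge_detections(hough, yolo, radius=6.0):
    """Merge two lists of (x, y) dot centres from different detectors.

    A Hough dot and a YOLO dot closer than `radius` pixels are treated as the
    same physical dot and replaced by their midpoint. Dots seen by only one
    detector are kept as they are. (The radius value is an assumption.)
    """
    merged = []
    unmatched_yolo = list(yolo)
    for hx, hy in hough:
        best_index, best_dist = None, radius
        # Find the nearest still-unmatched YOLO dot within the radius.
        for i, (yx, yy) in enumerate(unmatched_yolo):
            dist = hypot(hx - yx, hy - yy)
            if dist <= best_dist:
                best_index, best_dist = i, dist
        if best_index is None:
            merged.append((hx, hy))
        else:
            yx, yy = unmatched_yolo.pop(best_index)
            merged.append(((hx + yx) / 2, (hy + yy) / 2))
    # YOLO dots with no Hough partner are appended unchanged.
    merged.extend(unmatched_yolo)
    return merged
```

On frames where the YOLO pass is skipped, calling this with an empty `yolo` list simply returns the Hough detections.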

Grid reconstruction:

  • Lattice-snap algorithm maps detected dots to Braille cell positions, preserving empty slots
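
To make the idea concrete, here is a simplified sketch of snapping dots to a single line of Braille cells and encoding them as Braille Unicode. It assumes the grid origin (`x0`, `y0`), the dot pitch, and the cell pitch are already known, whereas a real pipeline would have to estimate them from noisy detections. The function name and parameters are hypothetical.

```python
BRAILLE_BASE = 0x2800  # U+2800 is the blank Braille pattern


def snap_to_cells(dots, x0, y0, dot_pitch, cell_pitch):
    """Snap (x, y) dot centres to one line of Braille cells.

    x0, y0     : position of dot 1 of the first cell (assumed known)
    dot_pitch  : spacing between dots inside a cell
    cell_pitch : spacing between the left columns of adjacent cells
    Returns a Braille Unicode string; cells without dots stay blank (U+2800).
    """
    # Shift so that the boundary between cells falls halfway through the gap.
    offset = (cell_pitch - dot_pitch) / 2
    cells = {}
    for x, y in dots:
        rel_x = x - x0
        cell = int((rel_x + offset) // cell_pitch)
        if cell < 0:
            continue  # dot lies left of the grid origin
        col = 0 if rel_x - cell * cell_pitch < dot_pitch / 2 else 1
        row = min(2, max(0, round((y - y0) / dot_pitch)))
        # Dots 1-3 are the left column, 4-6 the right; dot n maps to bit n-1.
        cells[cell] = cells.get(cell, 0) | (1 << (col * 3 + row))
    if not cells:
        return ""
    return "".join(chr(BRAILLE_BASE + cells.get(i, 0))
                   for i in range(max(cells) + 1))
```

Iterating over every cell index up to the last occupied one, rather than only over cells that contain dots, is what keeps empty slots in the output.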

Translation:

  • liblouis backTranslateString handles Grade 1 and Grade 2 Braille → English
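
A minimal sketch of the translation call, using the `louis` Python bindings that ship with liblouis. The table list is an assumption (`unicode.dis` so that Braille Unicode input is understood, plus a UEB Grade 2 table), as is the wrapper name `back_translate`. The fallback simply returns the Braille unchanged when the bindings are not installed.

```python
try:
    import louis  # Python bindings distributed with liblouis
except ImportError:  # e.g. a development machine without liblouis
    louis = None

# Assumed table names: "unicode.dis" maps Braille Unicode patterns to dots,
# "en-ueb-g2.ctb" is a contracted (Grade 2) Unified English Braille table.
DEFAULT_TABLES = ["unicode.dis", "en-ueb-g2.ctb"]


def back_translate(braille, tables=None):
    """Return English text for a Braille Unicode string.

    If the liblouis bindings are unavailable, the input is returned
    unchanged so the rest of the app keeps running.
    """
    if louis is None:
        return braille
    return louis.backTranslateString(tables or DEFAULT_TABLES, braille)
```

Passing a different table list, for example one with a Grade 1 table in place of the Grade 2 one, switches the translation mode without changing the caller.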

Model training:

  • YOLOv8n trained on Google Colab (T4 GPU) on a combined Angelina + DSBI dataset (~500 images)
  • Augmentations: 30° rotation, perspective warp, and brightness variation, to handle tilted and unevenly lit real-world pages

Stack: Python, Streamlit, OpenCV, ONNX Runtime, liblouis


Challenges we ran into

  • Low model confidence on small datasets: the maximum observed confidence was 0.28, well below the commonly used 0.4 threshold. After empirical testing we lowered the threshold to 0.25, accepting more false positives in exchange for usable detections.
  • Single-dataset model failed entirely: training on only one dataset gave near-zero confidence (max 0.037). Combining two datasets (Angelina + DSBI) fixed this.
  • Grid reconstruction from noisy dot positions: real camera images produce imperfect dot clusters. Building a lattice-snap algorithm that tolerates noise while preserving empty Braille cells took significant iteration.
  • Memory pressure on Streamlit Cloud: live camera processing on a free-tier server required a psutil-based guard that downscales frames when RAM usage exceeds 700 MB.
  • liblouis on Windows: our setup installs the translation library through apt, which is only available on Linux, so development on Windows required a graceful fallback for when the library is missing.
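
The memory guard can be sketched as a pure sizing function plus a small psutil helper. Only the 700 MB limit comes from our setup. The 640 and 320 pixel targets and both function names are assumptions for illustration, and the letterbox padding step is left out.

```python
def guarded_size(width, height, rss_bytes,
                 limit_bytes=700 * 1024 * 1024,
                 max_side=640, reduced_side=320):
    """Pick a frame size that preserves aspect ratio.

    When the process uses more than `limit_bytes` of RAM, the longest side
    is capped at `reduced_side` instead of `max_side`. Frames are never
    upscaled. (The two side lengths are assumed values.)
    """
    target = reduced_side if rss_bytes > limit_bytes else max_side
    scale = min(target / width, target / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def current_rss_bytes():
    """Resident memory of this process, read through psutil."""
    import psutil  # imported lazily so the sizing logic works without it
    return psutil.Process().memory_info().rss
```

In the app, each frame would be resized to `guarded_size(w, h, current_rss_bytes())` before preprocessing. Keeping the sizing logic separate from the psutil call makes it easy to test with made-up memory readings.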

Accomplishments that we're proud of

  • A fully working end-to-end pipeline: raw camera frame → English text
  • Hybrid detection (classical vision + deep learning) that's more robust than either alone
  • Lattice-based grid reconstruction that correctly handles missing dots and irregular spacing
  • Runs entirely in a browser with no install — deployed on Streamlit Community Cloud

What we learned

  • Combining classical computer vision with ML gives better real-world results than relying on either alone
  • Small training datasets need lower confidence thresholds and heavier augmentation
  • Dataset quality and diversity matter more than dataset size — merging two modest datasets outperformed one larger single-source dataset
  • Streamlit's fragment API enables live camera refresh without full page rerenders

What's next for BrailleApp

  • Expand training data to improve model confidence and accuracy
  • Support contracted (Grade 2) Braille more robustly across all tables
  • Mobile-optimized UI for direct phone camera use
  • Audio output (text-to-speech) for full accessibility
  • Support for multi-line Braille documents and page-level layout detection
