Inspiration
We were inspired by the need to make American Sign Language (ASL) more accessible to learners everywhere. Traditional ASL learning often requires in-person instruction or static resources, creating barriers for many people who want to learn this beautiful language. We wanted to create an interactive, AI-powered platform that could provide real-time feedback and make ASL spelling practice engaging and accessible from anywhere.
What it does
SpellWithASL is an interactive web application that teaches users to spell words using American Sign Language through real-time AI gesture recognition. Users can:
- Practice ASL spelling with real-time hand gesture recognition using their webcam
- Learn with instant feedback as the AI recognizes each letter and provides confidence scores
- Progress automatically through a curated vocabulary of 30+ practice words
- Track their learning with comprehensive statistics on letters and words completed
- Collect training data to continuously improve the AI model's accuracy
- Enjoy a seamless experience with celebration animations and smooth word transitions
The system detects hand landmarks in real time and feeds them to a trained neural network that identifies ASL letters, giving users immediate feedback on their signing accuracy.
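As a rough sketch of that flow (the model path, letter list, and function name below are illustrative, not the project's actual identifiers): flatten the 21 landmarks into a feature vector and ask the trained classifier for a letter and a confidence score.

```python
# Minimal sketch of the landmark -> letter flow; file and variable names
# are illustrative, not the project's actual code.
import numpy as np
from tensorflow import keras

LETTERS = [chr(c) for c in range(ord("A"), ord("Z") + 1)]  # 26 classes
model = keras.models.load_model("asl_landmarks.h5")        # hypothetical path

def predict_letter(landmarks: np.ndarray) -> tuple[str, float]:
    """landmarks: (21, 3) array of (x, y, z) points from MediaPipe Hands."""
    features = landmarks.reshape(1, -1)            # flatten to a (1, 63) vector
    probs = model.predict(features, verbose=0)[0]  # softmax over 26 letters
    best = int(np.argmax(probs))
    return LETTERS[best], float(probs[best])       # letter + confidence score
```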
How we built it
Frontend Architecture:
- Built with Next.js and React using TypeScript for type safety
- Integrated MediaPipe Hands for real-time hand landmark detection (its landmark output is sketched after this list)
- Implemented custom camera handling with retry logic and error recovery
- Designed a clean, minimal UI with consistent color palette and responsive design
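The browser side uses MediaPipe's JavaScript API, but the same 21-landmark output is available from its Python API, which is a natural fit for offline data collection; a hedged sketch (webcam index and options are assumptions):

```python
# Sketch of hand landmark extraction with MediaPipe's Python API; the browser
# uses the equivalent JavaScript API, but the 21-point output is the same.
import cv2
import mediapipe as mp

cap = cv2.VideoCapture(0)  # default webcam
with mp.solutions.hands.Hands(static_image_mode=False, max_num_hands=1) as hands:
    ok, frame = cap.read()
    if ok:
        # MediaPipe expects RGB input; OpenCV captures BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            points = results.multi_hand_landmarks[0].landmark
            coords = [(p.x, p.y, p.z) for p in points]  # 21 (x, y, z) triples
            print(len(coords))  # -> 21
cap.release()
```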
Backend & AI:
- Developed a FastAPI service for real-time ASL letter prediction (a minimal endpoint sketch follows this list)
- Created a TensorFlow/Keras neural network trained on hand landmark coordinates
- Implemented landmarks-only architecture (no images) for privacy and performance
- Used scikit-learn for data preprocessing and model validation
- Applied class balancing and data augmentation to handle training data imbalance
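A minimal sketch of what that prediction endpoint could look like (the route name, request schema, and `model_utils` helper are assumptions for illustration):

```python
# Hedged FastAPI sketch: accept 21 (x, y, z) landmarks, return letter + confidence.
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

from model_utils import predict_letter  # hypothetical wrapper around the Keras model

app = FastAPI()

class LandmarkRequest(BaseModel):
    landmarks: list[list[float]]  # 21 [x, y, z] points from MediaPipe Hands

@app.post("/predict")
def predict(req: LandmarkRequest) -> dict:
    letter, confidence = predict_letter(np.asarray(req.landmarks))
    return {"letter": letter, "confidence": confidence}
```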
Key Technical Decisions:
- Landmarks-only approach: Chose hand coordinate data over images for better privacy, performance, and real-time processing
- Microservices architecture: Separated AI inference, backend API, and frontend for scalability
- Robust state management: Implemented timing controls and validation to prevent auto-completion bugs (a simplified debouncing sketch follows this list)
- Progressive enhancement: Added automatic word progression with celebration feedback
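As a simplified illustration of those timing controls (the thresholds and class name are invented for the sketch), a letter counts only after it has been held steadily, with a cooldown so one completion cannot immediately trigger the next:

```python
# Simplified debouncing sketch; all thresholds are illustrative.
import time

CONFIDENCE_MIN = 0.85   # ignore low-confidence predictions
HOLD_FRAMES = 10        # frames the same letter must persist
COOLDOWN_SECONDS = 1.0  # pause after accepting a letter

class LetterDebouncer:
    def __init__(self) -> None:
        self.current: str | None = None
        self.streak = 0
        self.accepted_at = 0.0

    def update(self, letter: str, confidence: float) -> str | None:
        """Feed one prediction per frame; returns a letter only once it is held steadily."""
        if time.time() - self.accepted_at < COOLDOWN_SECONDS:
            return None                      # still in post-accept cooldown
        if confidence < CONFIDENCE_MIN:
            self.current, self.streak = None, 0
            return None
        self.streak = self.streak + 1 if letter == self.current else 1
        self.current = letter
        if self.streak >= HOLD_FRAMES:
            self.accepted_at = time.time()   # start cooldown, reset state
            self.current, self.streak = None, 0
            return letter
        return None
```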
Challenges we ran into
1. Architecture Evolution: Started with an image-based approach before realizing landmarks-only was superior for performance, privacy, and real-time processing, which required major refactoring.
2. Camera Stability: Encountered black screen issues and camera initialization failures. Solved with comprehensive retry logic, timeout handling, and manual restart functionality.
3. State Management Complexity: Faced auto-completion bugs where completing one letter would accidentally trigger the next letter. Fixed with robust validation, state cleanup, and timing controls.
4. Model Training Challenges: Dealt with class imbalance in the training data and had to implement proper hand pose normalization and data augmentation (the normalization idea is sketched after this list).
5. Real-time Performance: Balanced prediction accuracy with real-time responsiveness, implementing throttling and confidence thresholds.
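One common way to normalize hand pose (the project's exact preprocessing may differ) is to translate the wrist to the origin and divide by the hand's span, so the same sign produces the same features at any distance from the camera:

```python
# Wrist-relative, scale-invariant normalization sketch; one standard approach,
# not necessarily the project's exact preprocessing.
import numpy as np

def normalize_landmarks(landmarks: np.ndarray) -> np.ndarray:
    """landmarks: (21, 3) array; index 0 is the wrist in MediaPipe Hands."""
    centered = landmarks - landmarks[0]               # translate wrist to origin
    scale = np.max(np.linalg.norm(centered, axis=1))  # hand span in this frame
    return centered / (scale + 1e-8)                  # scale-invariant pose
```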
Accomplishments that we're proud of
🎯 Technical Achievements:
- Built a fully functional real-time ASL recognition system that works in web browsers
- Achieved seamless user experience with automatic word progression and celebration animations
- Implemented privacy-first architecture using landmarks instead of images
- Created robust error handling with automatic camera recovery and manual restart options
🎨 User Experience:
- Designed a beautiful, minimal interface with consistent design language
- Built engaging learning flow with progress tracking and visual feedback
- Implemented accessibility features and responsive design
🤖 AI/ML Success:
- Trained a working neural network with good accuracy across 26 letters
- Implemented proper data preprocessing with hand pose normalization
- Created a comprehensive training pipeline with model evaluation and testing (a rough sketch follows this list)
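A rough sketch of such a pipeline (the layer sizes and random stand-in data are assumptions, not the project's actual architecture or dataset):

```python
# Hedged sketch of a landmarks-only classifier plus held-out evaluation;
# the data and layer sizes are stand-ins, not the project's actual setup.
import numpy as np
from sklearn.model_selection import train_test_split
from tensorflow import keras

rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 63)).astype("float32")  # stand-in landmark vectors
y = rng.integers(0, 26, size=2000)                 # stand-in letter labels

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, stratify=y, random_state=0
)

model = keras.Sequential([
    keras.layers.Input(shape=(63,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(26, activation="softmax"),  # one output per letter
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=32, verbose=0)
print(model.evaluate(x_test, y_test, verbose=0))   # [loss, accuracy] held out
```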
What we learned
Technical Insights:
- Computer vision in the browser is powerful but requires careful optimization for real-time performance
- State management in real-time applications needs robust validation and cleanup to prevent race conditions
- Landmarks-based ML can be more effective than image-based approaches for gesture recognition
- User experience design is crucial for educational applications: smooth interactions keep users engaged
AI/ML Learnings:
- Data quality over quantity: Proper normalization and preprocessing matter more than just having lots of data
- Class balancing is essential when training on human-generated gesture data (see the class-weight sketch after this list)
- Real-time inference requires different considerations than batch processing
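For the class-balancing point above, a small sketch using scikit-learn's balanced weights fed into Keras (the labels here are random stand-ins for the real, imbalanced gesture data):

```python
# Class-balancing sketch: compute per-class weights with scikit-learn and
# pass them to Keras so rarer letters count more per sample.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

y_train = np.random.default_rng(0).integers(0, 26, size=1000)  # stand-in labels

classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weight = {int(c): float(w) for c, w in zip(classes, weights)}

# A Keras model would then be trained with:
# model.fit(x_train, y_train, epochs=30, class_weight=class_weight)
```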
Product Development:
- Progressive enhancement: Starting simple and adding features incrementally leads to more stable systems
- User feedback loops: Real-time validation and celebration animations significantly improve learning engagement
What's next for SpellWithASL
🚀 Immediate Improvements:
- Expand the vocabulary beyond spelling to a fuller ASL dictionary
- Add word difficulty levels and adaptive learning paths
- Support advanced gesture recognition for full ASL phrases and sentences
- Build mobile apps for iOS and Android
SpellWithASL represents our vision of making sign language education accessible, engaging, and effective through the power of AI and modern web technologies.
Built With
- fastapi
- keras
- mediapipe
- next.js
- node.js
- python
- react.js
- scikit-learn
- tailwindcss
- tensorflow
- typescript