SignHero 🤟
Inspiration
Over 48 million Americans are deaf or hard of hearing, yet ASL literacy remains low among the hearing population. We wanted to make learning ASL fingerspelling fun, accessible, and engaging—the same way Guitar Hero made learning rhythm patterns addictive.
The breakthrough idea: What if we could gamify sign language learning with real-time AI detection and rhythm-based gameplay?
What It Does
SignHero is a rhythm game that teaches ASL fingerspelling through webcam-powered gameplay:
| Mode | Description |
|---|---|
| 🎸 Song Game | Sign along to beatmaps synced with music—notes scroll down a Guitar Hero-style highway |
| 📚 Training Mode | Step-by-step practice with visual hand pose hints |
| ⏱️ Testing Mode | Timed challenges to measure proficiency |
| 🕹️ Whack-A-Sign | Arcade-style reflex game for quick recognition |
The AI watches your webcam, detects which ASL letter you're signing in real-time (~30-50ms latency), and scores your accuracy with combo multipliers, streak celebrations, and visual effects.
How We Built It
🧠 Machine Learning Pipeline
We built a custom model with a MobileNetV2 CNN backbone. The model has two main parts: feature extraction and classification. The extractor starts at 32 channels and uses 5 specialized attention layers, and a custom classifier head with standard ReLU activations maps a single frame onto 26 letter classes.
We trained on heavily augmented data (varied angles, different signers) so the model stays consistent across people and viewpoints and to reduce bias. There is a lot more documentation on the architecture, training, and preprocessing here: https://github.com/MsMarion/ASL-Fun-Training/tree/main/Base%20test/Sign-Language-Recognition/documentation
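A minimal PyTorch sketch of that shape, assuming torchvision's stock `mobilenet_v2` as the backbone; the attention layers and exact head sizes here are illustrative placeholders, not the real architecture (see the documentation linked above for the details):

```python
import torch
import torch.nn as nn
from torchvision.models import mobilenet_v2

class SignClassifier(nn.Module):
    """MobileNetV2 backbone with a custom 26-way classifier head (sketch)."""

    def __init__(self, num_classes: int = 26):
        super().__init__()
        self.features = mobilenet_v2(weights=None).features  # feature extractor
        # NOTE: the 5 specialized attention layers are omitted in this sketch
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(1280, 256),         # 1280 = MobileNetV2 final channel count
            nn.ReLU(),                    # standard ReLU activation
            nn.Dropout(0.2),
            nn.Linear(256, num_classes),  # logits for A-Z
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))
```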
Webcam Frame → MediaPipe (hand landmarks) → Feature Extraction → MobileNetV2 CNN → Letter Prediction
- MediaPipe Hands detects 21 hand landmarks from webcam frames
- Landmarks are drawn as a feature mask on a black background (sketch after this list)
- MobileNetV2 (trained on ASL alphabet data) classifies the pose into A-Z
- Both the original and mirrored images are processed; the higher-confidence prediction wins
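A hedged sketch of the landmark-mask step, using MediaPipe's Python solutions API and OpenCV (the helper name `landmarks_to_mask` and the 224×224 size are our assumptions):

```python
import cv2
import numpy as np
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils
hands = mp_hands.Hands(static_image_mode=False, max_num_hands=1)

def landmarks_to_mask(frame_bgr: np.ndarray, size: int = 224):
    """Detect 21 hand landmarks and draw them on a black canvas (the feature mask)."""
    results = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None                                   # no hand found this frame
    mask = np.zeros((size, size, 3), dtype=np.uint8)  # black background
    mp_draw.draw_landmarks(
        mask,
        results.multi_hand_landmarks[0],
        mp_hands.HAND_CONNECTIONS,                    # joints plus connecting bones
    )
    return mask                                       # this image feeds the CNN
```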
🎮 Full-Stack Architecture
| Layer | Technology |
|---|---|
| Frontend | Next.js 15, React 19, TypeScript, Tailwind CSS 4 |
| Animation | Framer Motion (smooth transitions, particle effects) |
| API | tRPC + React Query for type-safe data fetching |
| Database | MongoDB via Prisma ORM |
| ML Server | FastAPI (Python) serving the PyTorch model (sketch below) |
| Hand Tracking | MediaPipe for real-time landmark detection |
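As a concrete example of how the ML server layer can look, here is a minimal FastAPI sketch; the endpoint path, model file name, and preprocessing are assumptions for illustration, not the project's exact code:

```python
import io

import torch
from fastapi import FastAPI, File
from PIL import Image
from torchvision import transforms

app = FastAPI()
model = torch.jit.load("sign_model.pt").eval()  # assumed: exported TorchScript model
to_tensor = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

@app.post("/predict")
async def predict(image: bytes = File(...)):
    """Score one frame and return the best letter with its confidence."""
    x = to_tensor(Image.open(io.BytesIO(image)).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        probs = torch.softmax(model(x), dim=1)[0]
    conf, idx = probs.max(dim=0)
    return {"letter": LETTERS[int(idx)], "confidence": float(conf)}
```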
🎨 Visual Design
We built a synthwave aesthetic with:
- Neon grids, palm trees, animated sun
- Screen flash & shake on hits/misses
- Guitar Hero-style streak glow effects
- Particle bursts and floating score text
Challenges We Faced
⚡ Latency Optimization
Real-time gameplay requires sub-100ms detection. We achieved ~30-50ms by:
- Using a binary WebSocket protocol for frame transmission (see the sketch after this list)
- Running MediaPipe + MobileNetV2 on separate threads
- Processing both original and mirrored hand poses for better accuracy
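A sketch of that WebSocket path, assuming JPEG-encoded frames on the wire and a synchronous `classify(frame)` function wrapping the pipeline above (both names are ours):

```python
import asyncio

import cv2
import numpy as np
from fastapi import FastAPI, WebSocket

app = FastAPI()

def classify(frame: np.ndarray) -> dict:
    # placeholder: MediaPipe landmarks -> feature mask -> MobileNetV2 (see above)
    return {"letter": "A", "confidence": 0.0}

@app.websocket("/ws")
async def stream(ws: WebSocket):
    await ws.accept()
    loop = asyncio.get_running_loop()
    while True:
        data = await ws.receive_bytes()  # raw JPEG bytes: no base64/JSON overhead
        frame = cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)
        # run the heavy CV work in a worker thread, off the event loop
        result = await loop.run_in_executor(None, classify, frame)
        await ws.send_json(result)
```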
🎯 Sign Recognition Accuracy
Hand signs are subtle—slight angle changes affect predictions. We improved accuracy by:
- Training on hand landmark feature masks rather than raw images
- Using mirrored predictions (max of both) to handle left and right hands (sketch after this list)
- Implementing confidence thresholds to filter noise
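A sketch of the mirrored max-confidence trick combined with a threshold; the 0.6 cutoff is illustrative, not our tuned value:

```python
import torch

CONF_THRESHOLD = 0.6  # illustrative; tune on validation data

@torch.no_grad()
def predict_letter(model: torch.nn.Module, x: torch.Tensor):
    """x: (1, 3, H, W) mask tensor. Returns a letter, or None if too uncertain."""
    batch = torch.cat([x, torch.flip(x, dims=[3])])  # original + horizontal mirror
    probs = torch.softmax(model(batch), dim=1)       # shape (2, 26)
    conf, idx = probs.max(dim=1)                     # best class for each view
    best = conf.argmax()                             # keep the more confident view
    if conf[best] < CONF_THRESHOLD:
        return None                                  # noise: drop the prediction
    return chr(ord("A") + int(idx[best]))
```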
🎵 Timing Synchronization
Syncing game timing with AI predictions was tricky:
Note timing window: [noteTime - 2.0s, noteTime + 0.8s]
Perfect window: [noteTime - 0.3s, noteTime + 0.3s]
We buffer predictions and match them against note windows in real-time.
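Conceptually, the matcher keeps recent (letter, timestamp) predictions and checks each note against its windows. A simplified Python sketch of that judgment (data shapes and names are our assumptions):

```python
from dataclasses import dataclass

HIT_WINDOW = (-2.0, 0.8)     # seconds relative to noteTime
PERFECT_WINDOW = (-0.3, 0.3)

@dataclass
class Prediction:
    letter: str
    time: float  # song-clock timestamp

def judge(note_letter: str, note_time: float, buffer: list[Prediction]) -> str:
    """Return 'perfect', 'hit', or 'miss' for one note against buffered predictions."""
    best = None
    for p in buffer:
        dt = p.time - note_time
        if p.letter == note_letter and HIT_WINDOW[0] <= dt <= HIT_WINDOW[1]:
            if best is None or abs(dt) < abs(best):
                best = dt  # closest matching prediction wins
    if best is None:
        return "miss"
    return "perfect" if PERFECT_WINDOW[0] <= best <= PERFECT_WINDOW[1] else "hit"
```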
What We Learned
- MediaPipe is incredibly fast for hand tracking (~5-10ms)
- MobileNetV2 provides a great balance of accuracy vs. speed for real-time inference
- Building rhythm games requires careful attention to input latency
- Framer Motion makes complex animations surprisingly approachable
What's Next
- 🌐 Community beatmap creation and sharing
- 📱 Mobile app with on-device ML (Core ML / TensorFlow Lite)
- 🏆 Multiplayer competitive modes
- 📊 Learning analytics and progress tracking
- 🤲 Support for ASL words and phrases beyond fingerspelling (dataset generated and training started)
Built With
next.js react typescript tailwindcss framer-motion trpc prisma mongodb pytorch mediapipe fastapi python aws-s3


