Inspiration
The Deaf and hard-of-hearing community faces communication barriers every single day: in hospitals, classrooms, workplaces, and even just ordering coffee. Over 70 million Deaf people worldwide use sign language as their primary language, yet most hearing people know zero signs. We asked ourselves: what if technology could bridge that gap in real time? That question became SignFlowAI. We were also inspired by the sheer lack of accessible, beautiful, and actually usable ASL tools; most existing apps feel outdated or clunky, or require expensive hardware. We wanted to build something that felt premium, worked instantly in a browser, and genuinely helped people communicate.
What it does
SignFlowAI is an AI-powered communication platform that bridges the gap between Deaf and hearing communities through three core features:

- ASL → Speech: Users sign in front of their webcam and our YOLOv8 model (trained on the HandsSpeak dataset via Roboflow) detects 18+ ASL words in real time. Once they stop signing, the detected signs are assembled into a natural sentence and spoken aloud using ElevenLabs' human-sounding voice synthesis.
- Speech → Sign: Hearing users speak naturally into their microphone, and their words are instantly converted into ASL sign icons on screen, helping hearing people communicate back to Deaf users without knowing any sign language themselves.
- AI ASL Assistant: A chatbot powered by Gemini 1.5 Pro that acts as a personal ASL tutor: teaching users how to sign words, explaining ASL grammar, sharing Deaf culture knowledge, and providing emergency communication phrases. Every conversation is saved to MongoDB Atlas so users can revisit their learning history anytime.
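The Speech → Sign direction is essentially a lookup problem once the browser hands us a transcript. A minimal TypeScript sketch of the idea, where the icon paths, the `SIGN_ICONS` table, and the fingerspelling fallback are all illustrative assumptions rather than our actual asset layout:

```typescript
// Hypothetical word → icon lookup for the Speech → Sign view.
// The Web Speech API produces a transcript; each known word maps to
// an ASL icon asset, and unknown words fall back to fingerspelling
// the first letter (paths here are made up for illustration).
const SIGN_ICONS: Record<string, string> = {
  hello: "/signs/hello.svg",
  thank: "/signs/thank.svg",
  you: "/signs/you.svg",
};

function transcriptToIcons(transcript: string): string[] {
  return transcript
    .toLowerCase()
    .split(/\s+/)
    .filter(Boolean)
    .map((w) => SIGN_ICONS[w] ?? `/signs/fingerspell/${w[0]}.svg`);
}
```

In the app, the resulting paths would drive which icons render on screen as the hearing user speaks.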
How we built it
We built SignFlowAI as a full-stack Next.js 14 application using TypeScript throughout.
- Frontend: Next.js 14, React, Tailwind CSS, Framer Motion
- ASL Detection: Roboflow's hosted YOLOv8 model via the Roboflow Inference API
- AI Chatbot: Google Gemini 1.5 Pro via the Google Generative AI SDK
- Voice Synthesis: ElevenLabs multilingual v2 model with a graceful browser Speech Synthesis fallback
- Authentication: Clerk for seamless sign-up, login, and session management
- Database: MongoDB Atlas for persisting chat history per user
- Speech Recognition: Web Speech API (browser-native), so no API cost and no network round trip
Challenges we ran into
- Clerk v5 breaking change: Clerk's v5 SDK changed auth() from synchronous to async, and the resulting error is silent and cryptic. Our entire chatbot was returning 500 errors because of a single missing await.
- Gemini history format: Gemini 1.5 Pro requires strictly alternating user and model roles in conversation history. We had to write a sanitizer that enforces alternating roles before sending to the API.
- ElevenLabs quota management: The free tier has character limits, so we built a smart fallback: if ElevenLabs fails, the app automatically switches to browser Speech Synthesis without the user ever noticing.
- MongoDB connections in Next.js dev mode: Hot reloading creates multiple connections, causing pool exhaustion. Solved with a global singleton pattern that reuses the connection across hot reloads.
- YOLOv8 false positives: The model sometimes detects signs when no hand is present. Solved by deduplicating consecutive identical detections.
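The role sanitizer for Gemini can be sketched in a few lines. This is a simplified stand-in for our actual code (the `Turn` type mirrors the SDK's history shape): it merges consecutive same-role turns rather than dropping content, and trims any leading model turns, since Gemini expects the history to start with a user turn.

```typescript
// Gemini chat history must start with a "user" turn and strictly
// alternate user/model roles; this sanitizer enforces both.
type Role = "user" | "model";
interface Turn { role: Role; parts: { text: string }[] }

function sanitizeHistory(history: Turn[]): Turn[] {
  const out: Turn[] = [];
  for (const turn of history) {
    const last = out[out.length - 1];
    if (last && last.role === turn.role) {
      // Two same-role turns in a row: merge instead of dropping text.
      last.parts.push(...turn.parts);
    } else {
      out.push({ role: turn.role, parts: [...turn.parts] });
    }
  }
  // Drop leading model turns so the history begins with "user".
  while (out.length > 0 && out[0].role !== "user") out.shift();
  return out;
}
```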
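The ElevenLabs fallback boils down to a small policy: try the premium voice, and on any failure (quota, network) downgrade silently to the browser's built-in speech. A sketch with both speakers injected as parameters (our real code calls the ElevenLabs API and `window.speechSynthesis` directly; the names here are ours):

```typescript
// Try ElevenLabs first; on any failure, fall back to browser TTS.
// Returns which engine actually spoke, which is handy for logging.
async function speakWithFallback(
  text: string,
  elevenLabs: (t: string) => Promise<void>,
  browserTts: (t: string) => void,
): Promise<"elevenlabs" | "browser"> {
  try {
    await elevenLabs(text);
    return "elevenlabs";
  } catch {
    browserTts(text); // silent downgrade: the user still hears a voice
    return "browser";
  }
}
```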
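The MongoDB singleton relies on the fact that Next.js dev mode re-evaluates modules on every hot reload while `globalThis` survives, so the connection promise can be cached there. A generic sketch (in the real app, `create` would be something like `() => new MongoClient(uri).connect()`; the property name is ours):

```typescript
// Cache the connection promise on globalThis so hot reloads reuse it
// instead of opening a fresh connection and exhausting the pool.
function getSharedClient<T>(create: () => Promise<T>): Promise<T> {
  const g = globalThis as unknown as { _mongoClientPromise?: Promise<T> };
  if (!g._mongoClientPromise) {
    g._mongoClientPromise = create(); // first evaluation: connect once
  }
  return g._mongoClientPromise;       // later reloads: reuse the promise
}
```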
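The deduplication step for YOLOv8 output is simple: a held sign is reported on many consecutive frames, so runs of identical labels collapse into a single word before sentence assembly. A minimal sketch (types and names are illustrative):

```typescript
// Collapse runs of identical consecutive detections into single words,
// e.g. HELLO, HELLO, THANK_YOU, THANK_YOU → HELLO, THANK_YOU.
interface Detection { label: string }

function dedupeConsecutive(frames: Detection[]): string[] {
  const words: string[] = [];
  for (const d of frames) {
    if (words[words.length - 1] !== d.label) {
      words.push(d.label); // only keep a label when it changes
    }
  }
  return words;
}
```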
Accomplishments that we're proud of
- Built a fully working real-time ASL detection pipeline entirely in the browser, with no local model installation needed
- Achieved a seamless ElevenLabs → browser voice fallback so the app never breaks
- Saved every chat message to MongoDB per authenticated user, creating a genuinely personalized learning experience
- Integrated five different services (Roboflow, Gemini, ElevenLabs, Clerk, MongoDB) into one cohesive application within the hackathon timeframe
What we learned
- ASL is not English with hands: it has its own grammar, syntax, and structure
- Real-time ML in the browser is surprisingly achievable with Roboflow's hosted inference API
- API version mismatches are silent killers; always read the changelog
- Fallback design is as important as the feature itself
- Accessibility is a design challenge, not just a technical one
What's next for SignFlowAI
- Two-way real-time conversation mode: a live video call with SignFlowAI as the interpreter layer
- Mobile app: a React Native version for real-world situations like doctor visits or emergencies