Inspiration
In our interconnected world, language remains the last great barrier to global collaboration. We've witnessed brilliant minds unable to share ideas, families struggling to stay connected across continents, and businesses losing opportunities due to communication gaps. The "Hacker-Verse" theme challenged us to transcend physical limitations. We realized that language barriers create invisible walls between people, fragmenting our digital universe into isolated linguistic islands. Fluent.io was born from a simple vision: what if everyone could communicate naturally, regardless of the language they speak?
What it does
Fluent.io is a web-based video calling platform that enables real-time multilingual conversations. Users join a video call, select their preferred language, and speak naturally. Our platform:
- Translates speech in real time: Captures speech, translates it, and synthesizes it in the listener's language
- Preserves voice characteristics: Using ElevenLabs AI, we maintain the speaker's emotion, tone, and pace
- Provides live subtitles: Displays real-time captions for accessibility and clarity
- Supports 29+ languages: From major languages to regional dialects
- Works instantly: No downloads, plugins, or setup required - just open your browser
Example: when a Spanish speaker talks, English listeners hear natural English speech in near real time, complete with appropriate emotional inflection. It feels like everyone is speaking the same language.
How we built it
Our architecture combines browser-based real-time media with hosted AI services:
Frontend Architecture:
- React 18 for responsive UI with real-time state management
- Three.js for immersive visual effects aligned with the Hacker-Verse theme
- Tailwind CSS for rapid, consistent styling
- WebRTC for peer-to-peer video/audio streaming
Backend Systems:
- Node.js with TypeScript for type-safe server development
- Socket.IO for WebRTC signaling and real-time event handling
- Supabase for authentication and session management
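To make the signaling role concrete, here is a minimal sketch of the relay logic a signaling server performs before peers can connect directly. It is not our production code: the `SignalingHub` class, the `SignalMessage` shape, and the `Deliver` callback are hypothetical names chosen for illustration, and the transport (Socket.IO in our case) is deliberately left out so the sketch stays self-contained.

```typescript
// Hypothetical message shape: WebRTC does not define a signaling format,
// so every field name here is an assumption made for this sketch.
type SignalKind = "offer" | "answer" | "ice-candidate";

interface SignalMessage {
  kind: SignalKind;
  from: string; // sender's peer id
  to: string; // recipient's peer id
  payload: unknown; // SDP or ICE candidate, opaque to the server
}

// Callback that pushes a message to one connected peer.
type Deliver = (message: SignalMessage) => void;

class SignalingHub {
  // roomId -> (peerId -> delivery callback)
  private rooms = new Map<string, Map<string, Deliver>>();

  // Register a peer and return the ids of peers already in the room,
  // so the newcomer knows whom to send offers to.
  join(roomId: string, peerId: string, deliver: Deliver): string[] {
    const room = this.rooms.get(roomId) ?? new Map<string, Deliver>();
    this.rooms.set(roomId, room);
    const existingPeers = Array.from(room.keys());
    room.set(peerId, deliver);
    return existingPeers;
  }

  // Remove a peer and drop the room once it is empty.
  leave(roomId: string, peerId: string): void {
    const room = this.rooms.get(roomId);
    if (!room) return;
    room.delete(peerId);
    if (room.size === 0) this.rooms.delete(roomId);
  }

  // Forward a message to its recipient; returns false if the recipient is gone.
  relay(roomId: string, message: SignalMessage): boolean {
    const deliver = this.rooms.get(roomId)?.get(message.to);
    if (!deliver) return false;
    deliver(message);
    return true;
  }
}
```

In a real deployment the `deliver` callback would presumably wrap an emit to that peer's socket. The server never inspects the payload; it only routes it.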
AI Translation Pipeline:
- Web Speech API captures and transcribes speech in the original language
- Custom translation service processes the text
- ElevenLabs Conversational AI generates natural speech in the target language
- WebRTC streams the translated audio to participants
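The steps above can be sketched as a small function that takes a recognized transcript and runs it through translation and speech synthesis. This is an illustrative outline under stated assumptions, not our actual service code: `PipelineStages`, `translateTranscript`, and the stage signatures are hypothetical, and the stages are injected so the sketch does not depend on any vendor SDK.

```typescript
// Hypothetical stage interface: each function stands in for a real service call.
interface PipelineStages {
  translate(text: string, sourceLang: string, targetLang: string): Promise<string>;
  synthesize(text: string, targetLang: string): Promise<ArrayBuffer>;
}

interface TranslatedUtterance {
  sourceText: string;
  translatedText: string; // can also feed the live subtitles
  audio: ArrayBuffer; // synthesized speech, ready to stream to listeners
}

async function translateTranscript(
  stages: PipelineStages,
  transcript: string,
  sourceLang: string,
  targetLang: string
): Promise<TranslatedUtterance | null> {
  const sourceText = transcript.trim();
  // Silence or noise produces an empty transcript: nothing to send.
  if (sourceText.length === 0) return null;

  // Skip the translation call when speaker and listener share a language.
  const translatedText =
    sourceLang === targetLang
      ? sourceText
      : await stages.translate(sourceText, sourceLang, targetLang);

  const audio = await stages.synthesize(translatedText, targetLang);
  return { sourceText, translatedText, audio };
}
```

Keeping the stages behind an interface also makes it easy to time each one separately or swap in a fake during tests.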
The complete pipeline achieves end-to-end translation in under 300 ms:
T_total = T_capture + T_STT + T_translate + T_TTS + T_stream
We optimized each component so that T_total < 300 ms.
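As a worked example of the latency sum, the sketch below adds up per-stage timings and checks them against the 300 ms budget. The per-stage numbers are illustrative placeholders, not measurements from Fluent.io, and the helper names (`StageLatencies`, `totalLatency`, `slowestStage`) are assumptions made for this example.

```typescript
// One field per term in the latency sum, all in milliseconds.
interface StageLatencies {
  capture: number; // T_capture
  stt: number; // T_STT
  translate: number; // T_translate
  tts: number; // T_TTS
  stream: number; // T_stream
}

const BUDGET_MS = 300; // end-to-end target stated in this write-up

function totalLatency(t: StageLatencies): number {
  return t.capture + t.stt + t.translate + t.tts + t.stream;
}

// Identify the stage that contributes the most, i.e. the best place to optimize next.
function slowestStage(t: StageLatencies): keyof StageLatencies {
  const names = Object.keys(t) as (keyof StageLatencies)[];
  return names.reduce((worst, name) => (t[name] > t[worst] ? name : worst));
}

// Illustrative numbers only (hypothetical, not measured).
const sample: StageLatencies = { capture: 20, stt: 90, translate: 60, tts: 100, stream: 25 };

console.log(totalLatency(sample), totalLatency(sample) < BUDGET_MS, slowestStage(sample));
// prints: 295 true tts
```

With these placeholder values the budget is met with only 5 ms to spare, which shows why every stage has to be trimmed rather than just one.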
Optimization Techniques:
- Implemented audio chunking for streaming translation
- Used WebWorkers to prevent UI blocking during processing
- Deployed edge functions for reduced latency
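Audio chunking, the first technique above, can be illustrated with a short helper that splits a buffer of PCM samples into fixed-duration pieces so downstream stages can start before the whole utterance is available. This is a sketch under assumptions: the function name `chunkSamples`, the mono `Float32Array` input, and the chunk duration are all hypothetical, and the exact stage where chunking is applied may differ in the real system.

```typescript
// Split mono PCM samples into chunks of chunkMs milliseconds each.
// The last chunk may be shorter if the input does not divide evenly.
function chunkSamples(
  samples: Float32Array,
  sampleRate: number,
  chunkMs: number
): Float32Array[] {
  const chunkSize = Math.floor((sampleRate * chunkMs) / 1000);
  if (chunkSize <= 0) {
    throw new RangeError("chunk duration is too short for this sample rate");
  }
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += chunkSize) {
    const end = Math.min(start + chunkSize, samples.length);
    // subarray returns a view on the same memory, so no samples are copied.
    chunks.push(samples.subarray(start, end));
  }
  return chunks;
}
```

For example, one second of audio at 16,000 Hz split into 250 ms chunks yields four chunks of 4,000 samples each. Smaller chunks lower the wait before the first piece is processed but increase the number of calls.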
Challenges we ran into
Latency Optimization
The biggest challenge was minimizing the delay between speech and translation. Initial attempts had 2-3 second delays, making conversations impossible. We solved this by:
- Implementing incremental speech processing
- Optimizing API calls with batching
- Using predictive text completion for common phrases
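Incremental speech processing can be sketched as follows: instead of waiting for the final transcript, each interim transcript is compared with what has already been handed downstream, and only the newly stable words are emitted. This is a simplified illustration, not our exact implementation. The `IncrementalTranscript` class and the `holdBack` parameter are hypothetical, and the sketch assumes already-emitted words are never revised by the recognizer.

```typescript
// Split text on whitespace, dropping empty entries.
function splitWords(text: string): string[] {
  return text.trim().split(/\s+/).filter((w) => w.length > 0);
}

class IncrementalTranscript {
  private emitted = 0; // number of words already handed downstream

  // Feed the latest interim transcript and get back the words not yet emitted.
  // The last holdBack words are withheld because recognizers often revise them.
  update(interim: string, holdBack = 1): string[] {
    const words = splitWords(interim);
    const stable = Math.max(this.emitted, words.length - holdBack);
    const fresh = words.slice(this.emitted, stable);
    this.emitted = stable;
    return fresh;
  }

  // Feed the final transcript: flush everything left and reset for the next utterance.
  finalize(finalText: string): string[] {
    const fresh = splitWords(finalText).slice(this.emitted);
    this.emitted = 0;
    return fresh;
  }
}
```

Feeding "hola", then "hola como", then "hola como estas" emits nothing, then "hola", then "como"; finalizing with "hola como estas" flushes "estas". A production version would likely emit phrase-sized groups rather than single words, since word-by-word translation loses context.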
Accomplishments that we're proud of
- Sub-300ms translation pipeline: Achieved near-instantaneous translation that enables natural conversation
- Emotional intelligence: Successfully preserved tone, emotion, and speaking style across languages
- Scalable architecture: Designed the system to handle multiple concurrent conversations
- Accessibility focus: Implemented live captions, making the platform more inclusive
- Zero-installation requirement: Built entirely with web technologies for instant access
What we learned
- WebRTC mastery: Gained deep understanding of real-time communication protocols
- AI service integration: Learned to orchestrate multiple AI services for seamless UX
What's next for Fluent.io
- Implement SFU (Selective Forwarding Unit) for rooms with 5+ participants
- Add meeting recording with multilingual transcripts
- Create mobile applications for iOS and Android
- AR/VR integration: Bring Fluent.io to spatial computing platforms
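The motivation for the SFU item in the list above comes down to simple arithmetic: in a peer-to-peer mesh, every participant uploads a separate copy of their media to each other participant, while with an SFU each participant uploads once and the server forwards it. The sketch below counts streams per participant under that simplified model; the function names are hypothetical, and per-language translated audio tracks are ignored.

```typescript
interface StreamCounts {
  uploadsPerPeer: number;
  downloadsPerPeer: number;
}

// Full mesh: each peer sends to and receives from every other peer directly.
function meshStreams(participants: number): StreamCounts {
  const others = Math.max(0, participants - 1);
  return { uploadsPerPeer: others, downloadsPerPeer: others };
}

// SFU: each peer uploads a single stream to the server and downloads one per other peer.
function sfuStreams(participants: number): StreamCounts {
  const others = Math.max(0, participants - 1);
  return { uploadsPerPeer: 1, downloadsPerPeer: others };
}
```

For a five-person room, the mesh requires four uploads per participant while the SFU requires one, which is why upload bandwidth stops being the bottleneck as rooms grow.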
Built With
- auth-0
- elevenlabs
- google-web-speech-api
- javascript
- node.js
- react
- socket.io
- supabase
- tailwindcss
- typescript
- webrtc


