Inspiration

In our interconnected world, language remains the last great barrier to global collaboration. We've witnessed brilliant minds unable to share ideas, families struggling to stay connected across continents, and businesses losing opportunities due to communication gaps. The "Hacker-Verse" theme challenged us to transcend physical limitations. We realized that language barriers create invisible walls between people, fragmenting our digital universe into isolated linguistic islands. Fluent.io was born from a simple vision: what if everyone could communicate naturally, regardless of the language they speak?

What it does

Fluent.io is a web-based video calling platform that enables real-time multilingual conversations. Users join a video call, select their preferred language, and speak naturally. Our platform:

  • Translates speech in real time: Captures speech, translates it, and synthesizes it in the listener's language
  • Preserves voice characteristics: Using ElevenLabs AI, we maintain the speaker's emotion, tone, and pace
  • Provides live subtitles: Displays real-time captions for accessibility and clarity
  • Supports 29+ languages: From major languages to regional dialects
  • Works instantly: No downloads, plugins, or setup required; just open your browser

Example: when a Spanish speaker talks, English listeners hear natural English speech in near real-time, complete with appropriate emotional inflection. It feels like everyone is speaking the same language.

How we built it

Our architecture combines standard browser technologies with hosted AI services:

Frontend Architecture:

  • React 18 for responsive UI with real-time state management
  • Three.js for immersive visual effects aligned with the Hacker-Verse theme
  • Tailwind CSS for rapid, consistent styling
  • WebRTC for peer-to-peer video/audio streaming

Backend Systems:

  • Node.js with TypeScript for type-safe server development
  • Socket.IO for WebRTC signaling and real-time event handling
  • Supabase for authentication and session management
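
To make the signaling role concrete, here is a minimal, transport-agnostic sketch of a relay that forwards WebRTC offers, answers, and ICE candidates between peers in one call. The `SignalingRoom` class, the `Peer` interface, and the message shape are hypothetical names chosen for illustration, not Fluent.io's actual code; in a Socket.IO server, `send` would typically wrap `socket.emit`.

```typescript
// Hypothetical signaling message: an SDP offer/answer or an ICE candidate.
type SignalType = "offer" | "answer" | "ice-candidate";

interface SignalMessage {
  type: SignalType;
  from: string;     // sender's peer id
  to: string;       // recipient's peer id
  payload: unknown; // SDP or ICE data, opaque to the relay
}

// Minimal view of a connected client (assumption: one object per socket).
interface Peer {
  id: string;
  send(message: SignalMessage): void;
}

class SignalingRoom {
  private peers = new Map<string, Peer>();

  // Register a peer and return the ids already present,
  // so the newcomer knows whom to send offers to.
  join(peer: Peer): string[] {
    const existing = Array.from(this.peers.keys());
    this.peers.set(peer.id, peer);
    return existing;
  }

  leave(peerId: string): void {
    this.peers.delete(peerId);
  }

  // Forward a message to its addressee. Returns false when either
  // the sender or the recipient is not in the room.
  relay(message: SignalMessage): boolean {
    const target = this.peers.get(message.to);
    if (!target || !this.peers.has(message.from)) return false;
    target.send(message);
    return true;
  }
}
```

The relay never inspects the payload: media flows peer-to-peer over WebRTC, and the server only helps the peers find each other.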

AI Translation Pipeline:

  • Web Speech API captures and transcribes speech in the original language
  • Custom translation service processes the text
  • ElevenLabs Conversational AI generates natural speech in the target language
  • WebRTC streams the translated audio to participants
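
The stages above can be sketched as a chain of injected async functions. This is an illustration under stated assumptions rather than our production code: `PipelineStages`, `translateUtterance`, and every signature below are hypothetical. In particular, the Web Speech API listens to the microphone directly instead of accepting an audio buffer, so the `transcribe` signature is a simplification.

```typescript
// Each stage is injected so that a real recognizer, translation service,
// or TTS provider can be swapped in. All names here are hypothetical.
interface PipelineStages {
  transcribe(audio: Float32Array, sourceLang: string): Promise<string>;
  translate(text: string, sourceLang: string, targetLang: string): Promise<string>;
  synthesize(text: string, targetLang: string): Promise<Float32Array>;
}

interface PipelineResult {
  transcript: string;  // text in the speaker's language (usable for captions)
  translation: string; // text in the listener's language
  audio: Float32Array; // synthesized speech to stream to the listener
}

async function translateUtterance(
  stages: PipelineStages,
  audio: Float32Array,
  sourceLang: string,
  targetLang: string
): Promise<PipelineResult> {
  const transcript = await stages.transcribe(audio, sourceLang);

  // Skip translation and synthesis for silence or unrecognized speech.
  if (transcript.trim() === "") {
    return { transcript: "", translation: "", audio: new Float32Array(0) };
  }

  const translation = await stages.translate(transcript, sourceLang, targetLang);
  const translatedAudio = await stages.synthesize(translation, targetLang);
  return { transcript, translation, audio: translatedAudio };
}
```

Injecting the stages keeps the orchestration testable with stubs and makes it easy to measure each stage separately.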

The complete pipeline achieves end-to-end translation in under 300 ms. Total latency is the sum of the per-stage latencies:

$$T_{\text{total}} = T_{\text{capture}} + T_{\text{STT}} + T_{\text{translate}} + T_{\text{TTS}} + T_{\text{stream}}$$

We optimized each stage to keep $T_{\text{total}} < 300$ ms.
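
As a worked example of this budget, the sketch below sums per-stage latencies and reports the slowest stage. The helper names are hypothetical and the numbers in `exampleLatencies` are illustrative placeholders, not measurements from our pipeline.

```typescript
// Per-stage latencies in milliseconds, one field per term of the sum.
interface StageLatencies {
  capture: number;
  stt: number;
  translate: number;
  tts: number;
  stream: number;
}

const BUDGET_MS = 300;

function totalLatency(l: StageLatencies): number {
  return l.capture + l.stt + l.translate + l.tts + l.stream;
}

// How many milliseconds the pipeline exceeds the budget by (0 if within it).
function overBudgetBy(l: StageLatencies, budgetMs: number = BUDGET_MS): number {
  return Math.max(0, totalLatency(l) - budgetMs);
}

// The stage contributing the most latency, i.e. the first one to optimize.
function slowestStage(l: StageLatencies): keyof StageLatencies {
  const stages = Object.keys(l) as (keyof StageLatencies)[];
  return stages.reduce((worst, stage) => (l[stage] > l[worst] ? stage : worst));
}

// Placeholder values for illustration only (assumed, not measured).
const exampleLatencies: StageLatencies = {
  capture: 20,
  stt: 80,
  translate: 50,
  tts: 100,
  stream: 30,
};

console.log(totalLatency(exampleLatencies)); // 280
console.log(overBudgetBy(exampleLatencies)); // 0
console.log(slowestStage(exampleLatencies)); // "tts"
```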

Optimization Techniques:

  • Implemented audio chunking for streaming translation
  • Used Web Workers to prevent UI blocking during processing
  • Deployed edge functions for reduced latency
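
Audio chunking, the first technique in this list, can be sketched as follows. This is a minimal illustration that assumes mono PCM samples in a `Float32Array`; the `chunkAudio` helper is hypothetical, and the 16 kHz sample rate and 200 ms chunk length in the usage lines are assumed values, not our tuned settings.

```typescript
// Split PCM samples into fixed-duration chunks so downstream stages can
// start working before the speaker has finished the utterance.
function chunkAudio(
  samples: Float32Array,
  sampleRate: number, // samples per second
  chunkMs: number     // desired chunk duration in milliseconds
): Float32Array[] {
  if (sampleRate <= 0 || chunkMs <= 0) {
    throw new RangeError("sampleRate and chunkMs must be positive");
  }
  const chunkSize = Math.max(1, Math.floor((sampleRate * chunkMs) / 1000));
  const chunks: Float32Array[] = [];
  for (let start = 0; start < samples.length; start += chunkSize) {
    // subarray returns a view on the same buffer, so no samples are copied;
    // the final chunk is simply shorter when the length is not a multiple.
    chunks.push(samples.subarray(start, start + chunkSize));
  }
  return chunks;
}

// One second of audio at an assumed 16 kHz, cut into 200 ms chunks.
const oneSecond = new Float32Array(16000);
console.log(chunkAudio(oneSecond, 16000, 200).length); // 5
```

Shorter chunks lower the time to first output but increase per-request overhead, so the chunk length is a trade-off to tune.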

Challenges we ran into

Latency Optimization

The biggest challenge was minimizing the delay between speech and translation. Initial attempts had 2-3 second delays, making conversations impossible. We solved this by:

  • Implementing incremental speech processing
  • Optimizing API calls with batching
  • Using predictive text completion for common phrases
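
One way to picture incremental speech processing is a segmenter that releases each sentence for translation as soon as it is complete, instead of waiting for the whole utterance. The `IncrementalSegmenter` class below is a hypothetical sketch. It assumes the interim transcript only grows and contains sentence punctuation, which real speech recognizers do not guarantee (interim results can be revised and often lack punctuation).

```typescript
class IncrementalSegmenter {
  private consumed = 0; // number of transcript characters already emitted

  // Feed the full interim transcript so far; returns newly completed sentences.
  push(transcript: string): string[] {
    const segments: string[] = [];
    const pending = transcript.slice(this.consumed);
    const boundary = /[^.!?]*[.!?]+/g; // a run of text ending in . ! or ?
    let offset = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(pending)) !== null) {
      const segment = match[0].trim();
      if (segment.length > 0) segments.push(segment);
      offset = boundary.lastIndex; // end of the last complete sentence
    }
    this.consumed += offset;
    return segments;
  }

  // Call when the utterance ends to release any trailing partial sentence.
  flush(transcript: string): string | null {
    const rest = transcript.slice(this.consumed).trim();
    this.consumed = transcript.length;
    return rest.length > 0 ? rest : null;
  }
}

const segmenter = new IncrementalSegmenter();
console.log(segmenter.push("Hello there"));          // []
console.log(segmenter.push("Hello there. How are")); // ["Hello there."]
```

Each emitted segment can be translated and synthesized while the speaker is still talking, which is where most of the latency savings come from.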

Accomplishments that we're proud of

  • Sub-300ms translation pipeline: Achieved near-instantaneous translation that enables natural conversation
  • Emotional intelligence: Successfully preserved tone, emotion, and speaking style across languages
  • Scalable architecture: Designed the system to handle multiple concurrent conversations
  • Accessibility focus: Implemented live captions to make the platform more inclusive
  • Zero-installation requirement: Built entirely with web technologies for instant access

What we learned

  • WebRTC mastery: Gained a deep understanding of real-time communication protocols
  • AI service integration: Learned to orchestrate multiple AI services for seamless UX

What's next for Fluent.io

  • Implement SFU (Selective Forwarding Unit) for rooms with 5+ participants
  • Add meeting recording with multilingual transcripts
  • Create mobile applications for iOS and Android
  • AR/VR integration: Bring Fluent.io to spatial computing platforms
