Inspiration
I wanted to create an accessible tool that breaks down language barriers instantly. Inspired by personal experiences of traveling abroad and facing communication challenges, I imagined a seamless solution: instant translation that feels natural, smooth, and helpful in real-time interactions.
What it does
Comu translates spoken words into live subtitles, instantly displaying the translation on your computer screen. As you speak, Comu continuously listens, translates, and provides clear, large-format subtitles in the target language, making communication effortless and inclusive.
How I built it
I built Comu in Python. Vosk handles speech-to-text, googletrans (an unofficial client for Google Translate) handles translation, and Pyfiglet renders each translated phrase as large ASCII-art text so it is easy to read. The result is a responsive tool that shows real-time subtitles directly in the terminal.
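The three stages described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual Comu source: the helper names (extract_text, make_subtitle, real_backends) are hypothetical, the input is assumed to be the JSON string that a Vosk recognizer returns from Result(), and the googletrans call assumes a release in which translate() is synchronous (newer releases expose it as a coroutine). Microphone capture is left out because the write-up does not say which audio library was used.

```python
import json


def extract_text(vosk_result_json):
    """Return the recognized phrase from a Vosk-style JSON result string."""
    return json.loads(vosk_result_json).get("text", "").strip()


def make_subtitle(vosk_result_json, translate, render):
    """Turn one recognizer result into a rendered subtitle.

    translate and render are plain callables (str -> str), so the same
    function works with real libraries or with stand-ins for testing.
    """
    text = extract_text(vosk_result_json)
    if not text:
        # Silence or noise: nothing to translate, nothing to show.
        return ""
    return render(translate(text))


def real_backends(dest="es", font="standard"):
    """Hypothetical wiring to googletrans and pyfiglet (imported lazily).

    Assumes a googletrans release with a synchronous translate() method.
    """
    from googletrans import Translator
    import pyfiglet

    translator = Translator()

    def translate(text):
        return translator.translate(text, dest=dest).text

    def render(text):
        return pyfiglet.figlet_format(text, font=font)

    return translate, render
```

In a live loop, each string returned by the recognizer's Result() would be passed to make_subtitle together with the pair from real_backends(), and the returned banner printed to the terminal. Passing the translator and renderer in as callables keeps the core logic testable without a network connection.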
Challenges I ran into
One significant challenge was translation latency: getting subtitles on screen as quickly as possible after each phrase was spoken. Another hurdle was troubleshooting audio input/output issues, which required detailed adjustments to hardware configurations and software permissions. Managing asynchronous tasks efficiently in Python also proved critical to maintaining real-time responsiveness.
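One common way to keep a blocking translation call from stalling audio capture is to hand recognized phrases to a background worker through a queue. The sketch below assumes asyncio and Python 3.9+ (for asyncio.to_thread); the names translate_worker and run_pipeline are hypothetical, and this is an illustration of the general pattern rather than the approach Comu necessarily uses.

```python
import asyncio


async def translate_worker(queue, translate, show):
    """Consume phrases from the queue, translate each one, and display it."""
    while True:
        phrase = await queue.get()
        if phrase is None:
            # None is used here as a sentinel meaning "no more phrases".
            break
        # Run the blocking translate call in a thread so the event loop
        # stays free to accept new phrases in the meantime.
        translated = await asyncio.to_thread(translate, phrase)
        show(translated)


async def run_pipeline(phrases, translate, show):
    """Feed phrases to the worker and wait until all of them are shown."""
    queue = asyncio.Queue()
    worker = asyncio.create_task(translate_worker(queue, translate, show))
    for phrase in phrases:
        await queue.put(phrase)
    await queue.put(None)
    await worker
```

For example, asyncio.run(run_pipeline(["hello", "thank you"], translate, print)) would display each translation in the order the phrases were spoken, where translate is any str -> str callable. A single worker preserves subtitle order; the trade-off is that one slow request delays the phrases queued behind it.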
Accomplishments that I'm proud of
I'm proud of creating a fully functional, real-time translation tool within the constraints of the hackathon. Subtitles appear shortly after each phrase is spoken, which keeps the interaction smooth and makes Comu genuinely useful right away.
What I learned
Throughout this project, I deepened my understanding of real-time audio processing, asynchronous programming, and API integration. I also learned valuable lessons about rapid prototyping, debugging complex issues quickly, and improving user interface readability for real-time applications.
What's next for Comu
I plan to extend Comu by developing a graphical user interface (GUI) for broader accessibility and ease of use. Future enhancements include adding support for multiple languages simultaneously, mobile app integration, and improving translation accuracy with advanced language models and machine learning techniques.
Built With
- google-translate
- json
- os
- spyder
- vosk