Interactable - Hacklahoma 2026
Interactable is an intelligent, real-time lecture transcription and interactive learning platform. It listens to live audio, transcribes it in real time, extracts key concepts, and generates interactive simulations and assessments to help students understand complex topics on the fly.
Challenges Attempted
Best Use of Gemini
- We rely heavily on the Gemini API for our backend intelligence. We use it to:
  - Extract key concepts from the transcript in real time.
  - Generate study flashcards and quizzes.
  - Generate code for interactive visualizations (HTML/JS/Canvas) that render directly in the frontend.
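Getting reliable structured output from an LLM means defending against replies that are almost, but not quite, valid JSON. The sketch below shows one hedged approach to the parsing side: the actual Gemini call is omitted, and the `term`/`definition` schema is an illustrative assumption, not the project's exact format.

```python
import json
import re

def parse_concepts(model_text: str) -> list[dict]:
    """Parse a strict-JSON concept list out of an LLM reply.

    Models sometimes wrap JSON in a markdown fence, so strip that
    before parsing, and fail soft on anything malformed.
    """
    # Remove an optional ```json ... ``` fence around the payload.
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", model_text.strip())
    try:
        data = json.loads(cleaned)
    except json.JSONDecodeError:
        return []
    # Keep only well-formed entries: a term plus a short definition.
    return [
        {"term": c["term"], "definition": c["definition"]}
        for c in data
        if isinstance(c, dict) and "term" in c and "definition" in c
    ]

reply = """```json
[{"term": "gravity", "definition": "attraction between masses"}]
```"""
concepts = parse_concepts(reply)
```

Failing soft (returning an empty list) lets the live pipeline skip a bad model reply and retry, rather than crashing mid-lecture.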
Best Use of MongoDB
- We use MongoDB Atlas Vector Search to store lecture content alongside vector embeddings. This lets us run semantic/fuzzy searches across lecture content, so students can find relevant information even without exact keyword matches.
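A semantic query against Atlas runs as an aggregation pipeline built around the `$vectorSearch` stage. This is a minimal sketch of that pipeline; the index name, field names, and projected fields are assumptions for illustration, not the project's actual schema.

```python
def vector_search_pipeline(query_vector, limit=5,
                           index="lecture_embeddings",
                           path="embedding"):
    """Build a MongoDB Atlas $vectorSearch aggregation pipeline.

    `index` and `path` must match whatever the Atlas vector index
    defines; the names here are placeholders.
    """
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": path,
                "queryVector": query_vector,
                "numCandidates": limit * 20,  # oversample for better recall
                "limit": limit,
            }
        },
        # Project only what the frontend needs, plus the match score.
        {
            "$project": {
                "title": 1,
                "transcript_chunk": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]

pipeline = vector_search_pipeline([0.1, 0.2, 0.3], limit=3)
```

The pipeline would then be executed with `collection.aggregate(pipeline)` via PyMongo against a collection whose documents carry an `embedding` field.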
Best Use of ElevenLabs
- We use the ElevenLabs Scribe V2 model for our speech-to-text pipeline because of its superior sound isolation and transcription accuracy, ensuring high-quality transcripts even in noisy classroom environments.
Inspiration
We were inspired by the theme of the hackathon, which states: "Whether it's how people organize their time, learn new skills, communicate, prepare for their careers, or solve common problems, we encourage you to build technology that improves familiar experiences..."
One of the most common problems students face is difficult-to-understand lectures. Whether it's a language barrier, a fast-paced speaker, or just a complex topic, it's easy to get lost. We wanted to build a tool that doesn't just record what was said, but actively helps you understand it.
What it Does
Interactable listens to your college lecture and creates interactive materials to help you learn concepts.
- Real-time Transcription: See the lecture text appear live.
- Concept Highlighting: Important terms are automatically highlighted.
- Instant Simulations: Click on a concept to generate an interactive simulation (e.g., a physics simulation of gravity, or a visualization of a sorting algorithm).
- Video Search: Automatically finds relevant YouTube videos to supplement the lecture material.
- Study Tools: Generates flashcards and quizzes from the lecture content.
How we Built it
- Frontend: Built with Next.js 14 (App Router), relying on WebSockets for real-time communication with the backend. We used Framer Motion for smooth UI transitions and Tailwind CSS for styling.
- Backend: A FastAPI server that manages the real-time transcription pipeline.
- AI Integration:
  - Google Gemini processes text for concept extraction and simulation code generation.
  - ElevenLabs handles high-fidelity audio processing.
- Database: MongoDB stores lecture data and vector embeddings for semantic search.
- Infrastructure: The entire stack is containerized with Docker for consistent deployment.
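The stack above is essentially an event-driven pipeline: audio chunks become transcript events that fan out to connected clients. The sketch below shows that shape with plain `asyncio` queues; the stage names are hypothetical stand-ins, with the real speech-to-text call and WebSocket sends stubbed out.

```python
import asyncio

async def transcriber(audio_chunks, transcript_queue):
    """Stand-in for the speech-to-text stage (ElevenLabs in the real app)."""
    for chunk in audio_chunks:
        await asyncio.sleep(0)        # yield control, as a network call would
        await transcript_queue.put(f"text:{chunk}")
    await transcript_queue.put(None)  # sentinel: the audio stream ended

async def broadcaster(transcript_queue, clients):
    """Stand-in for the WebSocket fan-out to connected frontends."""
    while True:
        item = await transcript_queue.get()
        if item is None:
            break
        for client in clients:
            client.append(item)       # real app: `await ws.send_text(item)`

async def run_pipeline(audio_chunks):
    queue: asyncio.Queue = asyncio.Queue()
    client_a: list[str] = []
    client_b: list[str] = []
    await asyncio.gather(
        transcriber(audio_chunks, queue),
        broadcaster(queue, [client_a, client_b]),
    )
    return client_a, client_b

a, b = asyncio.run(run_pipeline(["c1", "c2"]))
```

Decoupling the stages with a queue means a slow consumer (say, a client on a bad connection) never blocks transcription itself, which matters for keeping the live transcript responsive.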
Challenges we ran into
- Real-time Latency: Balancing the need for accurate AI responses with the requirement for low-latency feedback in a live lecture setting was difficult. We had to optimize our WebSocket message handling and prompt engineering.
- Docker Networking: Ensuring successful communication between our Next.js frontend, FastAPI backend, and external APIs within a containerized environment required careful configuration of Docker Compose networks and volume mounts for hot-reloading.
- Hallucinations: Preventing the LLM from generating broken simulation code required robust system prompting and error handling.
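One cheap guard against rendering broken generated markup is a tag-balance check before the snippet ever reaches the frontend. This is an illustrative sketch of that idea using the standard-library `html.parser`, not the project's actual validation logic, and it is deliberately far from a full HTML validator.

```python
from html.parser import HTMLParser

# Void elements legally have no closing tag, so they must not go on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}

class TagBalanceChecker(HTMLParser):
    """Cheap sanity check that every opened tag is eventually closed.

    It catches the most common way an LLM-generated snippet breaks:
    a truncated reply leaving a tag unclosed.
    """
    def __init__(self):
        super().__init__()
        self.stack = []
        self.balanced = True

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.balanced = False  # closing tag with no matching open

def looks_renderable(snippet: str) -> bool:
    checker = TagBalanceChecker()
    checker.feed(snippet)
    checker.close()
    return checker.balanced and not checker.stack

ok = looks_renderable("<div><canvas id='sim'></canvas></div>")
bad = looks_renderable("<div><script>draw(")
```

A failed check can trigger a regeneration request to the model instead of shipping a blank or broken iframe to the student.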
Accomplishments that we're proud of
- Live Interactivity: Successfully generating working, interactive code visualizations from live speech in seconds.
- Seamless Integration: Tying together three powerful APIs (Gemini, ElevenLabs, MongoDB) into a cohesive, user-friendly experience.
- Polished UI: Creating a "wow" factor with a modern, high-contrast, and animated interface that feels like a premium product.
What we learned
- Prompt Engineering: We learned how significantly the structure of a system prompt affects the quality of generated code and the reliability of strict JSON output.
- Full-Stack Streams: Managing data streams from audio input -> API -> Backend -> Frontend WebSocket -> React State gave us a deep appreciation for event-driven architectures.
What's next for Interactable
- Mobile App: Bringing the experience to iOS/Android for students on the go.
- Multi-user Sessions: Allowing an entire class to join a shared session where the professor's audio drives everyone's screen.
- LMS Integration: Directly syncing generated quizzes and summaries with Canvas or Blackboard.
Built With
- elevenlabs
- fastapi
- mongodb
- nextjs