Inspiration

Education should be accessible to everyone, anywhere, at any time. Traditional learning platforms are limited to flat screens and passive consumption. We envisioned a future where learners could simply ask for knowledge and have it materialize around them in immersive 3D space. The Meta Quest's capabilities—voice recognition, spatial computing, and immersive displays—made this vision possible. We wanted to democratize learning by combining AI's ability to generate personalized content with VR's power to create engaging, distraction-free learning environments.

What it does

LearnSpace VR is an AI-powered, voice-activated tutorial assistant that lives in your Meta Quest headset. Users simply speak a question or topic (e.g., "explain what IoT is" or "teach me Python basics"), and within 2-3 minutes our AI generates a custom video tutorial complete with narration, visuals, and structured lessons.

The experience includes:

  • Voice-first interaction: Natural speech recognition—no typing in VR
  • Inline video playback: Watch tutorials directly on the main screen or open a floating cinema mode
  • Spatial UI: Chat interface and video players positioned ergonomically in 3D space
  • Grabbable elements: Move screens and videos to your preferred viewing position
  • AI chat assistant: Conversational interface that guides you through your learning journey
  • Real-time generation feedback: See progress as your custom tutorial is being created
  • Programmatically generated animations: High-quality mathematical and technical visualizations

Whether you're learning programming, science, history, or any topic, LearnSpace VR adapts to your needs and delivers content in the most immersive format possible.

How we built it

Tech Stack:

  • Immersive Web SDK (IWSDK): Built the entire VR experience using Meta's latest web framework
  • Claude API (Anthropic): Powers the conversational AI assistant and structures tutorial content
  • Manim (Mathematical Animation Engine): Generates programmatic educational animations with code-driven visuals
  • Custom Video Generation Pipeline: Backend service that orchestrates AI content creation, Manim rendering, and text-to-speech narration
  • Web Speech API: Enables natural voice input in VR
  • Three.js/WebXR: 3D rendering and spatial interactions
  • Canvas API: Dynamic UI rendering for chat and video thumbnails

Architecture:

  1. Voice input captured via the Web Speech API's SpeechRecognition interface
  2. User query sent to Claude API to structure tutorial content and generate Manim animation scripts
  3. Manim renders mathematical/technical animations programmatically
  4. Text-to-speech generates professional narration synchronized with visuals
  5. Video pipeline combines animations and audio into final tutorial
  6. Delivered back to VR environment with spatial playback controls
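
As a rough sketch, the orchestration behind these six steps might look like the following TypeScript; the stage functions are hypothetical placeholders for our internal services, not a published API:

```typescript
// Sketch only: the stage functions below are hypothetical placeholders
// for our internal services, declared rather than defined.
interface TutorialPlan {
  manimScript: string; // Python source for a Manim Scene subclass
  narration: string;   // narration script, already paced per scene
}

declare function structureWithClaude(query: string): Promise<TutorialPlan>;
declare function renderManim(script: string): Promise<string>;       // -> video path
declare function synthesizeNarration(text: string): Promise<string>; // -> audio path
declare function muxVideo(video: string, audio: string): Promise<string>; // -> URL

async function generateTutorial(query: string): Promise<string> {
  // Steps 1-2: Claude structures the lesson and emits the Manim script.
  const plan = await structureWithClaude(query);

  // Steps 3-4: rendering and narration are independent, so run them
  // concurrently; they only meet again at the muxing step.
  const [videoPath, audioPath] = await Promise.all([
    renderManim(plan.manimScript),
    synthesizeNarration(plan.narration),
  ]);

  // Step 5: combine animation and audio into the final tutorial.
  const url = await muxVideo(videoPath, audioPath);

  // Step 6: the VR client picks this URL up for spatial playback.
  return url;
}
```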

Why Manim? We chose Manim (the same library used by 3Blue1Brown) because it creates mathematically precise, programmatically generated animations that are perfect for educational content. Unlike pre-recorded stock footage, Manim allows us to generate custom diagrams, equations, graphs, and technical visualizations on-demand for any topic—from calculus to computer networks to physics simulations.

Development Process: We started with the core interaction loop—voice input → AI processing → video display. The integration of Manim was crucial for quality: instead of relying on static slides or screen recordings, we generate animated content that rivals professional educational channels. The biggest technical hurdle was orchestrating the entire pipeline (AI structuring → Manim scripting → rendering → narration → VR delivery) to complete within 2-3 minutes while maintaining high visual quality.

Challenges we ran into

Manim Rendering Performance: Manim produces stunning animations but can be render-intensive. We optimized rendering settings, implemented efficient scene management, and parallelized operations to keep generation time under 3 minutes while maintaining 1080p quality.
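
For illustration, fanning out per-scene renders from a Node backend could look like this; the render flags are real Manim Community Edition options, but the scene-per-process split is an assumption of this sketch:

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Render each scene as its own manim process so scenes encode in
// parallel; -qh is Manim Community Edition's 1080p quality preset.
async function renderScenes(scriptPath: string, sceneNames: string[]): Promise<void> {
  await Promise.all(
    sceneNames.map((scene) => run("manim", ["render", "-qh", scriptPath, scene]))
  );
}
```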

AI-to-Manim Script Generation: Converting natural language topics into valid Manim Python code required careful prompt engineering. Claude API needed to understand both the educational content AND how to express it as Manim scenes with proper syntax, timing, and visual hierarchy.
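
A minimal sketch of that structuring call using the official @anthropic-ai/sdk for TypeScript; the model id and prompt wording here are placeholders, not our tuned production prompt:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Constrain Claude to emit only a runnable Manim scene; the exact
// prompt below is illustrative, not our production prompt.
async function topicToManimScript(topic: string): Promise<string> {
  const msg = await anthropic.messages.create({
    model: "claude-3-5-sonnet-latest", // placeholder model id
    max_tokens: 4000,
    system:
      "You are a Manim Community Edition expert. Reply with ONLY a " +
      "Python file defining one Scene subclass named Tutorial. Keep " +
      "each animation under 15 seconds and label every diagram.",
    messages: [{ role: "user", content: `Create a tutorial scene about: ${topic}` }],
  });
  const block = msg.content[0];
  return block.type === "text" ? block.text : "";
}
```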

Video Playback in WebXR: Getting HTML5 video elements to render properly in VR was complex. Browser autoplay policies, CORS restrictions, and texture updates required careful handling. We solved this by implementing a frame-copying system that continuously draws video frames to canvas textures.
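
The heart of that frame-copying approach, sketched with Three.js (names and resolution here are illustrative):

```typescript
import * as THREE from "three";

// Draw the current video frame into a canvas each tick and flag the
// texture dirty; copying frames ourselves gives explicit control over
// when the GPU texture updates, independent of autoplay/CORS quirks.
function makeVideoTexture(video: HTMLVideoElement): THREE.CanvasTexture {
  const canvas = document.createElement("canvas");
  canvas.width = 1920;
  canvas.height = 1080;
  const ctx = canvas.getContext("2d")!;
  const texture = new THREE.CanvasTexture(canvas);
  texture.colorSpace = THREE.SRGBColorSpace;

  function copyFrame() {
    if (video.readyState >= video.HAVE_CURRENT_DATA) {
      ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
      texture.needsUpdate = true; // re-upload to the GPU this frame
    }
    requestAnimationFrame(copyFrame); // inside XR, hook the session's RAF instead
  }
  copyFrame();
  return texture;
}
```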

Speech Recognition Reliability: Voice input in VR is challenging—microphone permissions, ambient noise, and recognition accuracy all posed problems. We added visual feedback, retry mechanisms, and a fallback text input system.
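
A simplified sketch of the retry-then-fallback flow using the standard Web Speech API; showTextInput stands in for our fallback UI and is assumed, not shown:

```typescript
// showTextInput is a hypothetical stand-in for the fallback text UI.
declare function showTextInput(): void;

const SpeechRec =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;

function listen(onQuery: (text: string) => void, retries = 2): void {
  if (!SpeechRec) return showTextInput(); // unsupported browser or no mic

  const rec = new SpeechRec();
  rec.lang = "en-US";
  rec.interimResults = false;

  rec.onresult = (e: any) => onQuery(e.results[0][0].transcript);
  rec.onerror = () =>
    retries > 0 ? listen(onQuery, retries - 1) : showTextInput();
  rec.start();
}
```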

Synchronizing Narration with Animations: Timing text-to-speech narration to match Manim scene transitions required precise coordination. We developed a timing system that analyzes Manim scene durations and generates appropriately paced narration.
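
As a simple illustration of the pacing idea, narration can be budgeted by scene duration; the ~150 words-per-minute figure is a common speaking-rate assumption, not a measured constant:

```typescript
// Budget each scene's narration by its rendered duration, assuming an
// average speaking rate of roughly 150 words per minute.
const WORDS_PER_SECOND = 150 / 60;

function wordBudgets(sceneDurations: number[]): number[] {
  return sceneDurations.map((secs) => Math.floor(secs * WORDS_PER_SECOND));
}

// e.g. scenes of 12s, 20s, and 8s => budgets of 30, 50, and 20 words,
// which the Claude prompt then enforces per scene.
```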

Memory Management: Multiple video elements and textures could crash the browser. We implemented proper cleanup, video element pooling, and texture disposal to maintain performance.
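
A minimal sketch of the pooling-and-disposal pattern (pool size and details are illustrative):

```typescript
import * as THREE from "three";

// Reuse a small fixed set of <video> elements instead of creating one
// per tutorial; stale textures are disposed so GPU memory is reclaimed.
class VideoPool {
  private free: HTMLVideoElement[] = [];

  constructor(size = 3) {
    for (let i = 0; i < size; i++) {
      const v = document.createElement("video");
      v.crossOrigin = "anonymous";
      v.playsInline = true;
      this.free.push(v);
    }
  }

  acquire(src: string): HTMLVideoElement {
    const v = this.free.pop() ?? document.createElement("video");
    v.src = src;
    return v;
  }

  release(v: HTMLVideoElement, texture?: THREE.Texture): void {
    v.pause();
    v.removeAttribute("src"); // let the browser drop the decoder
    v.load();
    texture?.dispose();       // free the GPU-side copy
    this.free.push(v);
  }
}
```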

3D Spatial UI Design: Positioning interactive elements in 3D space for optimal ergonomics and visibility required extensive testing and iteration.

Accomplishments that we're proud of

✅ Seamless voice interaction that feels natural and intuitive in VR
✅ Real AI-powered content generation that creates unique tutorials on any topic
✅ Professional-quality Manim animations generated programmatically on-demand
✅ Complete video production pipeline (AI → animation → narration → delivery) in under 3 minutes
✅ Smooth video playback integrated into the VR environment without performance issues
✅ Spatial UI design that leverages Meta Quest's 3D capabilities
✅ Robust error handling that gracefully manages API failures and rendering issues
✅ Built entirely with IWSDK, showcasing the power of web-based VR development
✅ End-to-end working prototype in just 5 weeks

The combination of AI structuring and Manim rendering means our tutorials rival professionally produced educational content—but they're generated on-demand for whatever the user wants to learn.

What we learned

Educational Animation at Scale: Mastered programmatic content generation using Manim, learning how to create clear, engaging mathematical and technical visualizations through code.

WebXR Development: Deepened our understanding of immersive web technologies and the challenges of building production-ready VR experiences in the browser.

AI Pipeline Orchestration: Learned how to chain multiple AI services (Claude for content structuring, Manim for visualization, TTS for narration) into a cohesive system that delivers high-quality results quickly.

VR UX Design Principles: Discovered the importance of spatial positioning, visual feedback, and ergonomic considerations in VR interfaces.

Asynchronous Workflows: Mastered handling long-running operations (video rendering) in VR while keeping users engaged and informed.

Performance Optimization: Gained expertise in managing resources (textures, videos, audio) in resource-constrained VR environments while maintaining 60+ FPS.

What's next for LearnSpace VR

Near-term improvements:

  • Hand tracking integration for controller-free interactions
  • Passthrough mode to blend learning content with real-world space
  • Enhanced Manim templates for more subject areas (chemistry, biology, programming, etc.)
  • Interactive animations where users can manipulate Manim-generated objects in VR space
  • Multi-language support for global accessibility
  • Progress tracking & achievements to gamify learning
  • Social features allowing friends to learn together
  • Tutorial library with personalized recommendations

Long-term vision:

  • Live Manim editing in VR: Let users modify animation parameters and see results in real-time
  • 3D Manim scenes: Extend beyond 2D animations to full 3D mathematical visualizations
  • Real-time Q&A during video playback with dynamic content generation
  • Collaborative learning spaces where multiple users can learn together
  • Offline mode with pre-generated popular tutorials
  • Integration with educational platforms (Khan Academy, Coursera, etc.)
  • AI tutor personality customization
  • Assessment & quizzes to reinforce learning with interactive Manim problems

LearnSpace VR demonstrates that combining cutting-edge AI, professional animation tools, and immersive VR creates a new paradigm for education—one where high-quality, personalized learning experiences are available to anyone, on any topic, instantly.
