Inspiration

Education should be accessible to everyone, anywhere, at any time. Traditional learning platforms are limited to flat screens and passive consumption. We envisioned a future where learners could simply ask for knowledge and have it materialize around them in immersive 3D space. The Meta Quest's capabilities—voice recognition, spatial computing, and immersive displays—made this vision possible. We wanted to democratize learning by combining AI's ability to generate personalized content with VR's power to create engaging, distraction-free learning environments.

What it does

LearnSpace VR is an AI-powered, voice-activated tutorial assistant that lives in your Meta Quest headset. Users simply speak a question or topic (e.g., "explain what IoT is" or "teach me Python basics"), and within 2-3 minutes our AI generates a custom video tutorial complete with narration, visuals, and structured lessons.

The experience includes:

  • Voice-first interaction: Natural speech recognition—no typing in VR
  • Inline video playback: Watch tutorials directly on the main screen or open a floating cinema mode
  • Spatial UI: Chat interface and video players positioned ergonomically in 3D space
  • Grabbable elements: Move screens and videos to your preferred viewing position
  • AI chat assistant: Conversational interface that guides you through your learning journey
  • Real-time generation feedback: See progress as your custom tutorial is being created
  • Programmatically generated animations: High-quality mathematical and technical visualizations

Whether you're learning programming, science, history, or any topic, LearnSpace VR adapts to your needs and delivers content in the most immersive format possible.

How we built it

Tech Stack:

  • Immersive Web SDK (IWSDK): Built the entire VR experience using Meta's latest web framework
  • Claude API (Anthropic): Powers the conversational AI assistant and structures tutorial content
  • Manim (Mathematical Animation Engine): Generates programmatic educational animations with code-driven visuals
  • Custom Video Generation Pipeline: Backend service that orchestrates AI content creation, Manim rendering, and text-to-speech narration
  • Web Speech API: Enables natural voice input in VR
  • Three.js/WebXR: 3D rendering and spatial interactions
  • Canvas API: Dynamic UI rendering for chat and video thumbnails

Architecture:

  1. Voice input captured via the Web Speech API's SpeechRecognition interface
  2. User query sent to Claude API to structure tutorial content and generate Manim animation scripts
  3. Manim renders mathematical/technical animations programmatically
  4. Text-to-speech generates professional narration synchronized with visuals
  5. Video pipeline combines animations and audio into final tutorial
  6. Delivered back to VR environment with spatial playback controls
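
As a rough sketch, the orchestration behind these six steps might look like the following TypeScript; the stage functions are hypothetical placeholders for our internal services, not a published API:

```typescript
// Sketch only: the stage functions below are hypothetical placeholders
// for our internal services, declared rather than defined.
interface TutorialPlan {
  manimScript: string; // Python source for a Manim Scene subclass
  narration: string;   // narration script, already paced per scene
}

declare function structureWithClaude(query: string): Promise<TutorialPlan>;
declare function renderManim(script: string): Promise<string>;       // -> video path
declare function synthesizeNarration(text: string): Promise<string>; // -> audio path
declare function muxVideo(video: string, audio: string): Promise<string>; // -> URL

async function generateTutorial(query: string): Promise<string> {
  // Steps 1-2: Claude structures the lesson and emits the Manim script.
  const plan = await structureWithClaude(query);

  // Steps 3-4: rendering and narration are independent, so run them
  // concurrently; they only meet again at the muxing step.
  const [videoPath, audioPath] = await Promise.all([
    renderManim(plan.manimScript),
    synthesizeNarration(plan.narration),
  ]);

  // Step 5: combine animation and audio into the final tutorial.
  const url = await muxVideo(videoPath, audioPath);

  // Step 6: the VR client picks this URL up for spatial playback.
  return url;
}
```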

Why Manim? We chose Manim (the same library used by 3Blue1Brown) because it creates mathematically precise, programmatically generated animations that are perfect for educational content. Unlike pre-recorded stock footage, Manim allows us to generate custom diagrams, equations, graphs, and technical visualizations on-demand for any topic—from calculus to computer networks to physics simulations.

Development Process: We started with the core interaction loop—voice input → AI processing → video display. The integration of Manim was crucial for quality: instead of relying on static slides or screen recordings, we generate animated content that rivals professional educational channels. The biggest technical hurdle was orchestrating the entire pipeline (AI structuring → Manim scripting → rendering → narration → VR delivery) to complete within 2-3 minutes while maintaining high visual quality.

Challenges we ran into

Manim Rendering Performance: Manim produces stunning animations but can be render-intensive. We optimized rendering settings, implemented efficient scene management, and parallelized operations to keep generation time under 3 minutes while maintaining 1080p quality.
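
For illustration, fanning out per-scene renders from a Node backend could look like this; the render flags are real Manim Community Edition options, but the scene-per-process split is an assumption of this sketch:

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Render each scene as its own manim process so scenes encode in
// parallel; -qh is Manim Community Edition's 1080p quality preset.
async function renderScenes(scriptPath: string, sceneNames: string[]): Promise<void> {
  await Promise.all(
    sceneNames.map((scene) => run("manim", ["render", "-qh", scriptPath, scene]))
  );
}
```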

AI-to-Manim Script Generation: Converting natural language topics into valid Manim Python code required careful prompt engineering. Claude API needed to understand both the educational content AND how to express it as Manim scenes with proper syntax, timing, and visual hierarchy.
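
A minimal sketch of that structuring call using the official @anthropic-ai/sdk for TypeScript; the model id and prompt wording here are placeholders, not our tuned production prompt:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Constrain Claude to emit only a runnable Manim scene; the exact
// prompt below is illustrative, not our production prompt.
async function topicToManimScript(topic: string): Promise<string> {
  const msg = await anthropic.messages.create({
    model: "claude-3-5-sonnet-latest", // placeholder model id
    max_tokens: 4000,
    system:
      "You are a Manim Community Edition expert. Reply with ONLY a " +
      "Python file defining one Scene subclass named Tutorial. Keep " +
      "each animation under 15 seconds and label every diagram.",
    messages: [{ role: "user", content: `Create a tutorial scene about: ${topic}` }],
  });
  const block = msg.content[0];
  return block.type === "text" ? block.text : "";
}
```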

Video Playback in WebXR: Getting HTML5 video elements to render properly in VR was complex. Browser autoplay policies, CORS restrictions, and texture updates required careful handling. We solved this by implementing a frame-copying system that continuously draws video frames to canvas textures.
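
The heart of that frame-copying approach, sketched with Three.js (names and resolution here are illustrative):

```typescript
import * as THREE from "three";

// Draw the current video frame into a canvas each tick and flag the
// texture dirty; copying frames ourselves gives explicit control over
// when the GPU texture updates, independent of autoplay/CORS quirks.
function makeVideoTexture(video: HTMLVideoElement): THREE.CanvasTexture {
  const canvas = document.createElement("canvas");
  canvas.width = 1920;
  canvas.height = 1080;
  const ctx = canvas.getContext("2d")!;
  const texture = new THREE.CanvasTexture(canvas);
  texture.colorSpace = THREE.SRGBColorSpace;

  function copyFrame() {
    if (video.readyState >= video.HAVE_CURRENT_DATA) {
      ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
      texture.needsUpdate = true; // re-upload to the GPU this frame
    }
    requestAnimationFrame(copyFrame); // inside XR, hook the session's RAF instead
  }
  copyFrame();
  return texture;
}
```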

Speech Recognition Reliability: Voice input in VR is challenging—microphone permissions, ambient noise, and recognition accuracy all posed problems. We added visual feedback, retry mechanisms, and a fallback text input system.
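
A simplified sketch of the retry-then-fallback flow using the standard Web Speech API; showTextInput stands in for our fallback UI and is assumed, not shown:

```typescript
// showTextInput is a hypothetical stand-in for the fallback text UI.
declare function showTextInput(): void;

const SpeechRec =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;

function listen(onQuery: (text: string) => void, retries = 2): void {
  if (!SpeechRec) return showTextInput(); // unsupported browser or no mic

  const rec = new SpeechRec();
  rec.lang = "en-US";
  rec.interimResults = false;

  rec.onresult = (e: any) => onQuery(e.results[0][0].transcript);
  rec.onerror = () =>
    retries > 0 ? listen(onQuery, retries - 1) : showTextInput();
  rec.start();
}
```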

Synchronizing Narration with Animations: Timing text-to-speech narration to match Manim scene transitions required precise coordination. We developed a timing system that analyzes Manim scene durations and generates appropriately paced narration.
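
As a simple illustration of the pacing idea, narration can be budgeted by scene duration; the ~150 words-per-minute figure is a common speaking-rate assumption, not a measured constant:

```typescript
// Budget each scene's narration by its rendered duration, assuming an
// average speaking rate of roughly 150 words per minute.
const WORDS_PER_SECOND = 150 / 60;

function wordBudgets(sceneDurations: number[]): number[] {
  return sceneDurations.map((secs) => Math.floor(secs * WORDS_PER_SECOND));
}

// e.g. scenes of 12s, 20s, and 8s => budgets of 30, 50, and 20 words,
// which the Claude prompt then enforces per scene.
```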

Memory Management: Multiple video elements and textures could crash the browser. We implemented proper cleanup, video element pooling, and texture disposal to maintain performance.
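
A minimal sketch of the pooling-and-disposal pattern (pool size and details are illustrative):

```typescript
import * as THREE from "three";

// Reuse a small fixed set of <video> elements instead of creating one
// per tutorial; stale textures are disposed so GPU memory is reclaimed.
class VideoPool {
  private free: HTMLVideoElement[] = [];

  constructor(size = 3) {
    for (let i = 0; i < size; i++) {
      const v = document.createElement("video");
      v.crossOrigin = "anonymous";
      v.playsInline = true;
      this.free.push(v);
    }
  }

  acquire(src: string): HTMLVideoElement {
    const v = this.free.pop() ?? document.createElement("video");
    v.src = src;
    return v;
  }

  release(v: HTMLVideoElement, texture?: THREE.Texture): void {
    v.pause();
    v.removeAttribute("src"); // let the browser drop the decoder
    v.load();
    texture?.dispose();       // free the GPU-side copy
    this.free.push(v);
  }
}
```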

3D Spatial UI Design: Positioning interactive elements in 3D space for optimal ergonomics and visibility required extensive testing and iteration.

Accomplishments that we're proud of

✅ Seamless voice interaction that feels natural and intuitive in VR
✅ Real AI-powered content generation that creates unique tutorials on any topic
✅ Professional-quality Manim animations generated programmatically on-demand
✅ Complete video production pipeline (AI → animation → narration → delivery) in under 3 minutes
✅ Smooth video playback integrated into the VR environment without performance issues
✅ Spatial UI design that leverages Meta Quest's 3D capabilities
✅ Robust error handling that gracefully manages API failures and rendering issues
✅ Built entirely with IWSDK, showcasing the power of web-based VR development
✅ End-to-end working prototype in just 5 weeks

The combination of AI structuring and Manim rendering means our tutorials rival professionally produced educational content—but they're generated on-demand for whatever the user wants to learn.

What we learned

Educational Animation at Scale: Mastered programmatic content generation using Manim, learning how to create clear, engaging mathematical and technical visualizations through code.

WebXR Development: Deepened our understanding of immersive web technologies and the challenges of building production-ready VR experiences in the browser.

AI Pipeline Orchestration: Learned how to chain multiple AI services (Claude for content structuring, Manim for visualization, TTS for narration) into a cohesive system that delivers high-quality results quickly.

VR UX Design Principles: Discovered the importance of spatial positioning, visual feedback, and ergonomic considerations in VR interfaces.

Asynchronous Workflows: Mastered handling long-running operations (video rendering) in VR while keeping users engaged and informed.

Performance Optimization: Gained expertise in managing resources (textures, videos, audio) in resource-constrained VR environments while maintaining 60+ FPS.

What's next for LearnSpace VR

Near-term improvements:

  • Hand tracking integration for controller-free interactions
  • Passthrough mode to blend learning content with real-world space
  • Enhanced Manim templates for more subject areas (chemistry, biology, programming, etc.)
  • Interactive animations where users can manipulate Manim-generated objects in VR space
  • Multi-language support for global accessibility
  • Progress tracking & achievements to gamify learning
  • Social features allowing friends to learn together
  • Tutorial library with personalized recommendations

Long-term vision:

  • Live Manim editing in VR: Let users modify animation parameters and see results in real-time
  • 3D Manim scenes: Extend beyond 2D animations to full 3D mathematical visualizations
  • Real-time Q&A during video playback with dynamic content generation
  • Collaborative learning spaces where multiple users can learn together
  • Offline mode with pre-generated popular tutorials
  • Integration with educational platforms (Khan Academy, Coursera, etc.)
  • AI tutor personality customization
  • Assessment & quizzes to reinforce learning with interactive Manim problems

LearnSpace VR demonstrates that combining cutting-edge AI, professional animation tools, and immersive VR creates a new paradigm for education—one where high-quality, personalized learning experiences are available to anyone, on any topic, instantly.
