Timothyng28/VideoGraph

VideoGraph

Think Visually, Explore Infinitely

Website HackPrinceton

An AI-powered educational platform that transforms curiosity into interactive, personalized learning journeys through dynamically generated video content.


💡 Inspiration

VideoGraph reimagines how people learn complex topics: not through static textbooks or pre-recorded lectures, but as interactive journeys that adapt to each person's understanding in real time. We wanted to empower learners to explore knowledge organically, giving them the freedom to ask questions, dive deeper, and have AI create personalized explanations exactly when they need them.

Our goal: educational videos that respond, adapt, and grow with you, creating a truly infinite learning experience that feels as natural as a conversation with a teacher.


🌟 Features

🎥 Dynamic Video Generation

  • Generate fully animated educational videos from simple text prompts
  • Manim-powered animations with graphs, vectors, molecules, and more
  • AI-generated voiceovers perfectly synchronized with visuals
  • Real-time rendering progress via Server-Sent Events

🌳 Interactive Learning Tree

  • Each video becomes a node in your personal knowledge graph
  • Ask follow-up questions to create new branches
  • Navigate through your learning journey visually
  • Infinite exploration: never hit a dead end

🧠 Adaptive Intelligence

  • Context-aware generation that remembers what you've learned
  • Difficulty adaptation based on your performance
  • Personalized remediation when you need extra help
  • Smart branching logic (questions → children, topics → siblings)

🔍 Semantic Search

  • Find concepts by meaning, not just keywords
  • Vector embedding-powered search (<50ms response time)
  • Search "how graphs work" and find "Introduction to Trees"

📊 Interactive Quizzes

  • Contextual quizzes at the end of learning branches
  • Instant feedback and comprehension evaluation
  • Automatic remediation videos for incorrect answers

💾 Cached Sessions

  • Explore pre-generated learning paths
  • Example topics: Pythagoras Theorem, Photosynthesis, and more
  • Visual thumbnails and AI-generated titles for easy navigation

🏗️ Tech Stack

Frontend

  • React + TypeScript + Vite
  • @xyflow/react for interactive tree visualization
  • TailwindCSS for styling
  • Remotion for video composition

Backend

  • FastAPI on Modal (serverless compute)
  • Python for orchestration and processing
  • Manim Community Edition for animation rendering
  • Google Cloud Storage for asset caching

AI & ML

  • Google Gemini for content generation and code synthesis
  • Cerebras for fast simple calls
  • Grok (xAI) for auxiliary tasks
  • Sentence Transformers for semantic embeddings
  • PyTorch for ML operations

Services

  • ElevenLabs for text-to-speech
  • Docker for containerization
  • Real-time streaming via SSE (Server-Sent Events)

🚀 Getting Started

Prerequisites

  • Node.js (v18+)
  • Python (3.10+)
  • Modal account (modal.com)
  • API keys for:
    • Google Gemini
    • ElevenLabs
    • Google Cloud Storage

Quick Start

# Clone the repository
git clone https://github.com/yourusername/videograph.git
cd videograph

# Install frontend dependencies
cd frontend
npm install

# Set up environment variables
cp env.template .env
# Edit .env with your API keys

# Start the development server
npm run dev

# In a separate terminal, deploy the backend
cd ../backend/modal
modal deploy main_video_generator.py

For detailed setup instructions, see:


📖 How It Works

1. Ask a Question

Type any question or topic you're curious about.

2. AI Generation

  • Gemini creates a structured lesson plan
  • Generates custom Manim animation code
  • ElevenLabs synthesizes synchronized voiceover
  • Multiple sections rendered in parallel on Modal
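
To illustrate the fan-out in the last step, here is a minimal sketch that renders lesson sections concurrently and collects the results in lesson order. It uses a local thread pool as a stand-in for Modal's remote workers, and `render_section` is a hypothetical placeholder for the real Manim render call.

```python
from concurrent.futures import ThreadPoolExecutor


def render_section(section: dict) -> str:
    """Hypothetical stand-in for rendering one lesson section to a video file."""
    return f"section_{section['index']}.mp4"


def render_all(sections: list[dict], max_workers: int = 4) -> list[str]:
    """Render sections concurrently and return the outputs in lesson order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order even when tasks finish out of order.
        return list(pool.map(render_section, sections))


plan = [{"index": i, "title": t} for i, t in enumerate(["Intro", "Proof", "Examples"])]
print(render_all(plan))
```

Keeping the results in input order means the finished sections can be stitched together without re-sorting.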

3. Watch & Explore

  • View kinetic, video-essay style animations
  • See your learning tree grow in real-time
  • Click any node to revisit previous concepts

4. Branch Infinitely

  • Ask follow-up questions at any point
  • Take quizzes to test your understanding
  • Get personalized help when you need it

5. Search & Navigate

  • Use semantic search to find related concepts
  • Browse cached sessions for inspiration
  • Export or share your learning journey

📁 Project Structure

videograph/
├── frontend/               # React + TypeScript UI
│   ├── src/
│   │   ├── components/    # UI components (tree, overlays, etc.)
│   │   ├── controllers/   # Video controller logic
│   │   ├── services/      # API clients
│   │   └── remotion/      # Video rendering components
│   └── public/            # Static assets
├── backend/
│   └── modal/             # FastAPI + Modal backend
│       ├── dev/           # Core logic modules
│       │   ├── api_logic.py
│       │   ├── generator_logic.py
│       │   └── reflection_logic.py
│       └── services/      # Shared services
│           ├── llm/       # LLM provider abstractions
│           ├── tts/       # Text-to-speech services
│           └── embeddings.py
├── scripts/               # Utility scripts
└── docs/                  # Documentation

🧩 Key Features Explained

Infinite Branching System

Every user interaction is mapped to a tree structure:

  • Child nodes: Follow-up questions that dive deeper
  • Sibling nodes: Related topics at the same level
  • Persistent state: Tree survives page refreshes
  • Concurrent generation: Ask multiple questions while videos render

See CACHED_SESSIONS_GUIDE.md for details.
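
The child/sibling rule can be sketched with a small tree type. This is an illustrative model only; the `Node` class and the `add_child` / `add_sibling` helpers are hypothetical names, not the structures used in the frontend.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity; parent/child links are cyclic
class Node:
    title: str
    parent: Optional["Node"] = None
    children: list["Node"] = field(default_factory=list)


def add_child(current: Node, title: str) -> Node:
    """A follow-up question dives deeper: attach it below the current node."""
    node = Node(title, parent=current)
    current.children.append(node)
    return node


def add_sibling(current: Node, title: str) -> Node:
    """A related topic stays at the same level: attach it to the current node's parent."""
    if current.parent is None:
        raise ValueError("the root node has no parent to attach a sibling to")
    return add_child(current.parent, title)


root = Node("Pythagorean theorem")
proof = add_child(root, "Why does the theorem hold?")
trig = add_sibling(proof, "Trigonometric ratios")
print([child.title for child in root.children])
```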

Code Healing Pipeline

AI-generated Manim code can fail unpredictably. Our healing system:

  1. Captures error traces and render failures
  2. Analyzes what went wrong (syntax, logic, assets)
  3. Repairs by sending context back to the LLM
  4. Retries up to N times with improved prompts
  5. Logs all failures for continuous improvement

See MULTI_VARIANT_RETRY_SYSTEM.md for implementation.
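
The capture, repair, and retry steps can be pictured as a bounded loop. The following is a minimal sketch under stated assumptions: `render` and `repair` are hypothetical callables standing in for the Manim render step and the LLM repair call, and the default of three retries is arbitrary.

```python
from typing import Callable


def heal_and_render(
    code: str,
    render: Callable[[str], str],
    repair: Callable[[str, str], str],
    max_retries: int = 3,
) -> tuple[str, list[str]]:
    """Try to render; on failure, feed the error back to `repair` and retry."""
    failures: list[str] = []  # kept so every failure can be logged afterwards
    for attempt in range(max_retries + 1):
        try:
            return render(code), failures
        except Exception as exc:
            trace = f"attempt {attempt}: {type(exc).__name__}: {exc}"
            failures.append(trace)
            if attempt == max_retries:
                break
            code = repair(code, trace)  # ask for a fixed version of the code
    raise RuntimeError(f"render failed after {max_retries} retries: {failures}")


# Toy stand-ins: the "renderer" rejects a misspelled class, the "LLM" fixes it.
def fake_render(code: str) -> str:
    if "Circle(" not in code:
        raise NameError("name 'Circl' is not defined")
    return "scene.mp4"


def fake_repair(code: str, trace: str) -> str:
    return code.replace("Circl(", "Circle(")


video, failures = heal_and_render("Circl()", fake_render, fake_repair)
print(video, len(failures))
```

Returning the failure list alongside the result keeps the logging step separate from the retry logic.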

Voice Synthesis Pipeline

  • ElevenLabs generates natural-sounding narration
  • Timing alignment syncs voice with visual transitions
  • Dynamic voice selection based on content type
  • Streamed in real-time during rendering

See TTS_PIPELINE.md and DYNAMIC_VOICE_ID_IMPLEMENTATION.md.

Semantic Search

  • Sentence transformers convert text to 384D vectors
  • Cosine similarity finds conceptually related nodes
  • Embeddings cached for instant retrieval
  • Searches entire tree in <50ms

See SEMANTIC_SEARCH_IMPLEMENTATION.md.
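
The ranking step reduces to cosine similarity between a query vector and each cached node vector. Here is a minimal, dependency-free sketch; the three-dimensional vectors are toy stand-ins for real 384-dimensional embeddings, and the `search` helper and sample titles are illustrative assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: list[float], index: dict[str, list[float]], top_k: int = 2) -> list[str]:
    """Return the titles of the top_k nodes most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), title) for title, vec in index.items()]
    return [title for _, title in sorted(scored, reverse=True)[:top_k]]


# Toy cached embeddings keyed by node title.
index = {
    "Introduction to Trees": [0.9, 0.1, 0.0],
    "Photosynthesis": [0.0, 0.2, 0.9],
    "Graph Traversal": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # toy embedding for "how graphs work"
print(search(query, index))
```

In a real deployment the vectors would come from the embedding model, and the node vectors would be computed once and cached so that only the query is embedded at search time.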


🏆 Accomplishments

  • ✅ Unified Learning Experience: Integrated AI, physics simulations, and video synthesis into one cohesive app
  • ✅ True Infinite Learning: Users never hit a wall: ask any question, get a video
  • ✅ Production-Ready AI Code: Robust validation and healing make LLM-generated code viable
  • ✅ Sub-50ms Semantic Search: Vector embeddings enable intuitive concept discovery
  • ✅ Serverless at Scale: Modal handles parallel rendering and bursty workloads efficiently

🚧 Challenges We Solved

Dynamic Tree Stitching

Building a procedurally generated, infinite tree required:

  • State machine in VideoController.tsx for atomic tree updates
  • Queue system for concurrent video generation
  • React Context for global state synchronization
  • Custom React Flow rendering for visual polish

Code Healing

Letting AI generate arbitrary Manim code meant handling:

  • Unpredictable compilation and runtime errors
  • Automated triage and stack trace analysis
  • Self-repair via LLM feedback loops (up to N retries)
  • Partial rendering for graceful degradation

Production-Grade Generation

Making AI content reliable at scale involved:

  • Structured output schemas (JSON) for predictable responses
  • Multi-model orchestration (Gemini, Cerebras, Grok)
  • Parallel rendering pipelines on Modal
  • Asset caching in Google Cloud Storage
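
As an example of the first point, a structured response can be checked before any rendering work starts. This is a minimal sketch; the schema (a `sections` list whose entries carry `title` and `narration`) is a hypothetical one chosen for illustration, not the project's actual lesson-plan format.

```python
import json

REQUIRED_SECTION_KEYS = {"title", "narration"}  # hypothetical schema


def parse_lesson_plan(raw: str) -> dict:
    """Parse an LLM response and reject it early if it does not match the schema."""
    plan = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(plan, dict) or not isinstance(plan.get("sections"), list):
        raise ValueError("lesson plan must be an object with a 'sections' list")
    for i, section in enumerate(plan["sections"]):
        if not isinstance(section, dict):
            raise ValueError(f"section {i} must be an object")
        missing = REQUIRED_SECTION_KEYS - set(section)
        if missing:
            raise ValueError(f"section {i} is missing {sorted(missing)}")
    return plan


raw = '{"sections": [{"title": "Intro", "narration": "A right triangle has..."}]}'
plan = parse_lesson_plan(raw)
print(len(plan["sections"]))
```

Failing fast on a malformed plan is cheaper than discovering the problem after a render has been scheduled.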

🔮 Future Roadmap

Voice Interaction

  • 🎤 Speech-to-text input for hands-free learning
  • 💬 Real-time dialogue with AI tutor
  • 🗣️ Voice cloning for personalized narration

Content Ecosystem

  • 📚 Public tree library for sharing knowledge graphs
  • 📋 Tree templates curated by experts
  • 🔗 Export to PDF, Notion, or shareable links
  • 🔌 Embedding API for third-party platforms

Enterprise & Education

  • 🏫 LMS integration (Canvas, Moodle, Blackboard)
  • 📊 Analytics dashboard for teachers
  • 📖 Curriculum alignment to learning standards
  • 🎓 Bulk generation for entire courses

📚 Documentation

🙏 Acknowledgments

  • Google Gemini for powerful content generation
  • Manim Community for animation framework
  • Modal for serverless infrastructure
  • ElevenLabs for natural voice synthesis
  • HackPrinceton for the opportunity to build and showcase

VideoGraph: Every question deserves an answer. Every answer deserves a video. Every video, made just for you. ✨

Website • Devpost • GitHub
