🎵 Lyria - AI-Powered Celestial Music Generator

A web application that generates AI music in real time from live video streams and space weather data. Gemini 2.0 Flash analyzes the visuals, and Lyria RealTime generates the continuous music.

Built with: Next.js · FastAPI · Python · TypeScript


🌟 Features

Video-Driven Music Generation

  • 🎬 Live Stream Support - Works with YouTube livestreams and VOD content
  • 🖼️ Frame Analysis - Captures and analyzes video frames every 15 seconds
  • 🎵 Continuous Music - Infinite streaming with smooth transitions
  • 🎨 Context-Aware - Music adapts to visual content in real-time
  • 💬 Interactive Queries - Users can steer music generation with text prompts

Space Weather Ambient Piano

  • 🌌 Real-Time Space Data - Fetches live solar wind, geomagnetic activity, and X-ray flux from NOAA
  • 🎹 Ambient Piano - Solo piano compositions that reflect cosmic conditions
  • 🪐 Homepage Integration - Plays as background music behind the Voyager Golden Record visual
  • Dynamic Updates - Music adjusts every 30 seconds based on space weather changes

DJ Controls

  • 🎛️ Real-Time Parameters - Adjust tempo, energy, mood, intensity
  • 🎼 Genre Selection - Multiple genre presets
  • 🔊 Volume Control - Independent volume and mute controls
  • ⏯️ Playback Control - Play, pause, stop functionality

Audio Management

  • 📥 Download Sessions - Download current or completed session audio as WAV files
  • 💾 Automatic Recording - Server saves all sessions automatically
  • 🎧 High Quality - 48kHz stereo PCM audio format
  • 📁 Session Management - Timestamped files for easy organization

🏗️ Architecture

┌─────────────────────────────────────────────────────────────┐
│                       Frontend (Next.js)                     │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐            │
│  │   Home     │  │  Streams   │  │   Upload   │            │
│  │  (Space    │  │  (Video    │  │  (Custom   │            │
│  │  Weather)  │  │  Streams)  │  │   Video)   │            │
│  └────────────┘  └────────────┘  └────────────┘            │
└────────────────────────┬────────────────────────────────────┘
                         │ WebSocket (Binary Audio + JSON)
┌────────────────────────┴────────────────────────────────────┐
│                    Backend (FastAPI)                         │
│  ┌──────────────────────────────────────────────────────┐   │
│  │          WebSocket Session Driver                     │   │
│  │  • Frame Capture (yt-dlp + ffmpeg)                   │   │
│  │  • Audio Buffering & Download                        │   │
│  └────────────┬─────────────────────────┬────────────────┘   │
│               │                         │                    │
│   ┌───────────▼──────────┐  ┌──────────▼────────────┐      │
│   │   Gemini 2.0 Flash   │  │   Lyria RealTime      │      │
│   │   (Image Analysis)   │  │   (Music Generation)  │      │
│   └──────────────────────┘  └───────────────────────┘      │
└─────────────────────────────────────────────────────────────┘

Data Flow

Video Mode:

YouTube/Video → Frame Capture (15s) → Gemini Analysis → Music Prompt
                                                              ↓
Frontend ← Binary Audio Stream (48kHz stereo) ← Lyria RealTime

Space Weather Mode:

NOAA APIs → Space Weather Data (30s) → Music Prompt Generation
                                              ↓
Frontend ← Binary Audio Stream (48kHz stereo) ← Lyria RealTime
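
The "Space Weather Data → Music Prompt Generation" step can be pictured with a small mapping function. This is only a sketch under stated assumptions, not the project's implementation: the `SpaceWeather` fields, the thresholds, and the prompt wording are all hypothetical. The one external fact used is that the planetary Kp index runs from 0 (quiet) to 9 (extreme), with values of 5 and above counted as geomagnetic storms.

```python
from dataclasses import dataclass


@dataclass
class SpaceWeather:
    """Hypothetical container for two NOAA-derived readings."""
    kp_index: float               # planetary K index, 0 (quiet) to 9 (extreme)
    solar_wind_speed_km_s: float  # bulk solar wind speed


def build_piano_prompt(weather: SpaceWeather) -> str:
    """Turn space weather readings into a text prompt for a solo piano piece."""
    # Geomagnetic activity drives the mood (thresholds are illustrative).
    if weather.kp_index >= 5:
        mood = "tense, dramatic"
    elif weather.kp_index >= 3:
        mood = "restless, shimmering"
    else:
        mood = "calm, spacious"
    # Solar wind speed drives the sense of motion (500 km/s is an assumed cutoff).
    tempo = "flowing" if weather.solar_wind_speed_km_s >= 500 else "slow"
    return f"Solo ambient piano, {mood}, {tempo} tempo"
```

In a loop that runs every 30 seconds, the resulting string would be sent to the music generator only when it differs from the previous prompt.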

🚀 Quick Start

Prerequisites

Required:

  • Node.js 18+ and npm
  • Python 3.8+
  • ffmpeg (for video frame capture)
  • Google AI API key (for Gemini & Lyria)

Optional:

  • Git (for version control)
  • yt-dlp (no separate install needed; pip installs it from requirements.txt)

Installation

1. Clone the Repository

git clone https://github.com/emw8105/hacktx-25.git
cd hacktx-25

2. Backend Setup

cd server

# Create virtual environment (recommended)
python -m venv .venv

# Activate virtual environment
# Windows PowerShell:
.\.venv\Scripts\Activate.ps1
# Windows CMD:
.\.venv\Scripts\activate.bat
# Linux/Mac:
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Install ffmpeg (if not already installed)
# Windows (using Chocolatey):
choco install ffmpeg
# Or download from: https://ffmpeg.org/download.html

# Mac:
brew install ffmpeg
# Linux:
sudo apt-get install ffmpeg

3. Frontend Setup

# Back to the repository root
cd ..
npm install

4. Environment Configuration

Create a .env file in the root directory with the following:

# REQUIRED: Google AI API Key
# Get your key from: https://aistudio.google.com/apikey
GEMINI_API_KEY=your_gemini_api_key_here

# REQUIRED: Lyria API Key (for Live Music API)
LYRIA_API_KEY=your_lyria_api_key_here

# REQUIRED: Google Cloud Project ID
GCP_PROJECT_ID=your_gcp_project_id_here

# OPTIONAL: Custom Lyria WebSocket URL (defaults to official endpoint)
LYRIA_WS_URL=wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1alpha.GenerativeService.BidiGenerateMusic

# OPTIONAL: Gemini Model (defaults to gemini-2.0-flash)
GEMINI_MODEL=gemini-2.0-flash

⚠️ Minimum Required Values:

GEMINI_API_KEY=your_key_here
LYRIA_API_KEY=your_key_here
GCP_PROJECT_ID=your_project_id_here

How to get your API keys:

  1. GEMINI_API_KEY: Create a key in Google AI Studio (https://aistudio.google.com/apikey)
  2. LYRIA_API_KEY: Obtain from Google's Live Music API access program
  3. GCP_PROJECT_ID: Your Google Cloud Project ID (create a project in the Google Cloud Console)
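
A quick way to catch a missing or placeholder value before starting the server is a check like the one below. It is a sketch rather than code from this repository: `missing_env_vars` is a hypothetical helper, and treating values that start with `your_` as unset is an assumption based on the placeholder text shown above.

```python
from typing import List, Mapping

# The three variables this README marks as required.
REQUIRED_VARS = ("GEMINI_API_KEY", "LYRIA_API_KEY", "GCP_PROJECT_ID")


def missing_env_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset, empty, or placeholders."""
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        # Values such as "your_gemini_api_key_here" are template placeholders.
        if not value or value.startswith("your_"):
            missing.append(name)
    return missing
```

Called as `missing_env_vars(os.environ)` after the .env file has been loaded (for example with python-dotenv's `load_dotenv()`), an empty list means all three values are present.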

5. Run the Application

Terminal 1 - Backend:

cd server
python main.py
# Server starts on http://localhost:8000

Terminal 2 - Frontend:

npm run dev
# App starts on http://localhost:3000

6. Open in Browser

Navigate to http://localhost:3000


📖 Usage Guide

1. Homepage (Space Weather Mode)

  • Background music automatically plays based on real-time space weather
  • Features the Voyager Golden Record rotating visual
  • Music updates every 30 seconds with cosmic conditions

2. Streams Page

  • Browse available YouTube livestreams and videos
  • Click any stream to start music generation
  • Add custom YouTube URLs via "Create Stream" button

3. Upload Page

  • Upload your own video files (MP4, MOV, AVI)
  • Drag-and-drop or browse to select
  • Instantly start music generation

4. Stream Page (Main Experience)

  • Video Display: Embedded YouTube player or video snapshot
  • Status Indicators: Connection status, listener count
  • Music Controls:
    • Play/Pause: Start or stop audio playback
    • Mute/Unmute: Toggle audio on/off
    • Download: Save current session as WAV file
  • DJ Controls: Adjust tempo, energy, mood, genre, and more
  • Query Input: Type prompts to steer music generation
    • Example: "Make it more energetic"
    • Example: "Add orchestral elements"

5. Downloading Audio

  • Click the Download button during or after a session
  • Downloads the current session's audio as a timestamped WAV file
  • Format: music_20251019_143052_session-abc123.wav
  • High quality: 48kHz stereo PCM
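
For scripts that organize downloaded files, the filename can be split back into its parts. The sketch below assumes every file follows the `prefix_YYYYMMDD_HHMMSS_sessionid.wav` pattern of the example above; `parse_session_filename` and `SessionFile` are hypothetical names, not part of the project.

```python
import re
from datetime import datetime
from typing import NamedTuple


class SessionFile(NamedTuple):
    prefix: str            # e.g. "music"
    recorded_at: datetime  # timestamp embedded in the filename
    session_id: str        # e.g. "session-abc123"


_PATTERN = re.compile(
    r"^(?P<prefix>[a-z_]+)_(?P<date>\d{8})_(?P<time>\d{6})_(?P<sid>.+)\.wav$"
)


def parse_session_filename(name: str) -> SessionFile:
    """Split a timestamped WAV filename into prefix, timestamp, and session ID."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected filename format: {name!r}")
    recorded_at = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return SessionFile(match["prefix"], recorded_at, match["sid"])
```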

🛠️ API Reference

REST Endpoints

GET /videos

Returns list of available streams

[
  {
    "id": "iss",
    "name": "ISS Livestream",
    "source": "https://youtube.com/...",
    "type": "stream"
  }
]

POST /videos/custom

Register a custom YouTube URL

{
  "name": "My Custom Stream",
  "url": "https://youtube.com/watch?v=..."
}
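
From Python, this request could be prepared with the standard library as sketched below. The base URL matches the default backend address from the Quick Start. The JSON content type is an assumption (the body above is JSON), and `build_custom_stream_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default backend address from Quick Start


def build_custom_stream_request(
    name: str, url: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Prepare (but do not send) a POST /videos/custom request."""
    body = json.dumps({"name": name, "url": url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/videos/custom",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(...)` sends it. The response format is not documented here, so inspect it before relying on any particular field.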

POST /upload/file

Upload a video file

  • Form data: name (string), file (video file)
  • Returns: {id, filename, name, bytes}

GET /audio/current/{session_id}

Download audio from active or completed session

  • Returns: WAV file with timestamped filename

GET /audio/latest

Download most recent audio session

  • Returns: Latest WAV file

GET /audio/list

List all saved audio files

{
  "files": [
    {
      "filename": "music_20251019_143052_session-abc.wav",
      "size": 12345678,
      "created": "2025-10-19T14:30:52",
      "modified": "2025-10-19T14:31:52"
    }
  ]
}
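
A client can use this listing to find the most recently modified recording. The following is a minimal sketch that assumes the response has exactly the shape shown above; `newest_file` is a hypothetical helper.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def newest_file(listing: Dict[str, Any]) -> Optional[str]:
    """Return the filename with the latest "modified" timestamp, or None if empty."""
    files = listing.get("files", [])
    if not files:
        return None
    # Timestamps are ISO 8601 strings such as "2025-10-19T14:31:52".
    newest = max(files, key=lambda f: datetime.fromisoformat(f["modified"]))
    return newest["filename"]
```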

WebSocket Endpoints

WS /ws/session/{session_id}

Main video-driven music generation endpoint

Client → Server Messages:

// Start session
{
  "type": "start",
  "source": "stream" | "video" | "upload",
  "id": "stream_id",
  "prompt": "Optional music description"
}

// Send query to steer music
{
  "type": "query",
  "text": "Make it more energetic"
}

// Update music config
{
  "type": "set_config",
  "bpm": 120,
  "temperature": 0.8
}

// Playback control
{
  "type": "control",
  "action": "play" | "pause" | "stop"
}

// DJ parameters
{
  "type": "dj_parameters",
  "parameters": {
    "tempo": 120,
    "energy": 0.8,
    "mood": "upbeat",
    "intensity": 0.7,
    "genre": "electronic",
    "volume": 0.9
  }
}

// Keep-alive
{"type": "ping"}
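
A test client might build these messages with small helpers that reject values outside the documented sets. The function names below are hypothetical, and the validation is an addition for illustration. Only the message shapes come from the reference above.

```python
import json
from typing import Optional

VALID_SOURCES = {"stream", "video", "upload"}
VALID_ACTIONS = {"play", "pause", "stop"}


def start_message(source: str, stream_id: str, prompt: Optional[str] = None) -> str:
    """Build the JSON text for a "start" message."""
    if source not in VALID_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    payload = {"type": "start", "source": source, "id": stream_id}
    if prompt:
        payload["prompt"] = prompt  # the prompt is optional
    return json.dumps(payload)


def query_message(text: str) -> str:
    """Build the JSON text for a "query" message that steers the music."""
    return json.dumps({"type": "query", "text": text})


def control_message(action: str) -> str:
    """Build the JSON text for a playback "control" message."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return json.dumps({"type": "control", "action": action})
```

Each helper returns a string ready to send as a WebSocket text frame.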

Server → Client Messages:

// Binary frames: Raw PCM audio (48kHz, stereo, 16-bit)
// Each chunk ~384KB (2 seconds of audio)

// JSON status messages:
{"status": "session_started", "session_id": "..."}
{"status": "lyria_stream_started", "mode": "lyria-realtime-ws"}
{"type": "lyria_playback_started"}
{"status": "snapshot_processed", "prompt": "..."}
{"type": "lyria_prompt_applied"}
{"status": "heartbeat"}
{"type": "pong"}
{"error": "Error message if something fails"}
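
Because the server mixes binary audio frames with JSON text frames, a client has to branch on the frame type first. The sketch below shows one way to do that in a Python test client. `handle_frame` and its return convention are hypothetical. It relies on the common behavior of WebSocket libraries (including `websockets`) of delivering binary frames as `bytes` and text frames as `str`.

```python
import json
from typing import Any, Tuple, Union

# 48,000 samples/s x 2 channels x 2 bytes per 16-bit sample
BYTES_PER_SECOND = 48_000 * 2 * 2


def handle_frame(frame: Union[bytes, str]) -> Tuple[str, Any]:
    """Classify one incoming frame as audio, error, or status event."""
    if isinstance(frame, (bytes, bytearray)):
        # Binary frame: raw PCM. Report how many seconds of audio it holds.
        return ("audio", len(frame) / BYTES_PER_SECOND)
    message = json.loads(frame)
    if "error" in message:
        return ("error", message["error"])
    # Status messages use either a "status" or a "type" key.
    kind = message.get("status") or message.get("type") or "unknown"
    return ("event", kind)
```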

🎨 Frontend Structure

app/
├── page.tsx                    # Homepage (space weather)
├── streams/page.tsx            # Browse streams
├── showcase/page.tsx           # Custom YouTube input
├── upload/page.tsx             # Video upload
├── stream/[id]/page.tsx        # Main stream player
│
components/
├── navigation.tsx              # Top navigation bar
├── dj-controls.tsx             # DJ parameter controls
├── ui/                         # Shadcn UI components
│
lib/
├── api.ts                      # API client functions
│
public/
├── lyria_header.mp4           # Background video
└── voyager.mp4                # Golden Record visual

Key Frontend Features

React Optimizations:

  • useCallback for all event handlers (stable function identities avoid unnecessary child re-renders)
  • useRef for stable references (WebSocket, Audio nodes)
  • Conditional state updates (only update if value changes)
  • Memoized input handlers

Audio Processing:

  • Web Audio API for playback
  • GainNode for volume control
  • AudioBuffer queue for smooth streaming
  • Real-time PCM decoding

State Management:

  • Minimal state updates (only UI-critical values)
  • Audio data stored in refs (not state)
  • Session ID tracking for downloads

🖥️ Backend Structure

server/
├── main.py                     # FastAPI application
├── gemini.py                   # Gemini & Lyria integration
├── space_weather.py            # NOAA space weather API
├── utils.py                    # Video capture utilities
├── requirements.txt            # Python dependencies
│
├── test_socket.py              # Video mode test client
├── test_space_weather.py       # Space weather test client
├── continuous_test_client.py   # Real-time playback test
├── continuous_recorder.py      # Continuous recording test
│
├── uploads/                    # Uploaded video files
├── audio_sessions/             # Recorded audio sessions
└── snapshots/                  # Captured video frames

Key Backend Features

WebSocket Session Management:

  • Binary audio streaming (48kHz stereo PCM)
  • Audio buffering for downloads
  • Automatic WAV file creation on disconnect
  • Concurrent session support
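
Turning buffered PCM into a WAV file only requires a header that states the sample rate, channel count, and sample width. The sketch below uses Python's standard `wave` module with the audio format documented in this README. `pcm_to_wav_bytes` is a hypothetical helper and may differ from how the server writes its files.

```python
import io
import wave


def pcm_to_wav_bytes(
    pcm: bytes, sample_rate: int = 48_000, channels: int = 2, sample_width: int = 2
) -> bytes:
    """Wrap raw PCM bytes in a WAV container and return the file contents."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

Writing the returned bytes to a path under audio_sessions/ would produce a playable file.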

Video Processing:

  • Frame capture with ffmpeg
  • yt-dlp for YouTube stream resolution
  • HLS/DASH stream support
  • Local video file support

Music Generation:

  • Lyria RealTime WebSocket protocol
  • Weighted prompt system
  • Configuration updates (BPM, temperature)
  • Playback control (play, pause, stop)

🧪 Testing

Test the Backend

cd server

# Test video-driven music (45 seconds)
python test_socket.py

# Test space weather piano (60 seconds)
python test_space_weather.py

# Test with custom duration (90 seconds): set AUDIO_DURATION first
# Windows CMD:
set AUDIO_DURATION=90
# PowerShell:
$env:AUDIO_DURATION = "90"
# Linux/Mac:
export AUDIO_DURATION=90
python test_socket.py

# Real-time playback test
python continuous_test_client.py
# Press Ctrl+C to stop

# Continuous recording with segments
python continuous_recorder.py
# Press Ctrl+C to stop

Output Files:

  • audio/music_YYYYMMDD_HHMMSS_sessionid.wav
  • audio/space_weather_YYYYMMDD_HHMMSS_sessionid.wav
  • audio/music_YYYYMMDD_HHMMSS_seg001.wav (segments)

Verify Setup

# Check API key is set
echo $env:GEMINI_API_KEY  # PowerShell
echo $GEMINI_API_KEY  # Linux/Mac

# Check ffmpeg is installed
ffmpeg -version

# Check backend is running
curl http://127.0.0.1:8000/videos

# Check frontend is running
curl http://localhost:3000

🐛 Troubleshooting

Common Issues

1. "WebSocket disconnected immediately"

  • Check if GEMINI_API_KEY is set correctly
  • Verify LYRIA_API_KEY has proper permissions
  • Check terminal for error messages

2. "No audio playing"

  • Open browser console (F12) and check for errors
  • Verify WebSocket connection status
  • Check audio context state (click Debug button)

3. "Failed to capture frame"

  • Ensure ffmpeg is installed and in PATH
  • Check YouTube URL is accessible
  • Try a different video source

4. "Module not found" errors

  • Backend: pip install -r requirements.txt
  • Frontend: npm install
  • Activate virtual environment for Python

5. "Audio download returns 404"

  • Wait a few seconds for audio to buffer
  • Check that session has started playing
  • Verify session_id is being tracked

6. "Port already in use"

  • Backend (8000): Change port in main.py
  • Frontend (3000): Use PORT=3001 npm run dev (Linux/Mac), or npm run dev -- -p 3001 on any platform

Enable Debug Mode

Frontend:

  • Open browser console (F12)
  • Click the "Debug" button to see audio state
  • Check Network tab for WebSocket messages

Backend:

  • Check terminal for detailed logs
  • All WebSocket messages are printed
  • Frame capture progress shown

📦 Dependencies

Backend (Python)

fastapi>=0.104.0
uvicorn>=0.24.0
websockets>=12.0
google-generativeai>=0.3.0
python-dotenv>=1.0.0
yt-dlp>=2023.10.13
aiohttp>=3.9.0  # For space weather

Frontend (Node.js)

next@14+
react@18+
typescript@5+
tailwindcss@3+
lucide-react  # Icons
shadcn/ui  # UI components

🎯 Performance Optimization

Frontend Optimizations

  • ✅ All event handlers wrapped in useCallback
  • ✅ Audio data stored in refs (not state)
  • ✅ Conditional state updates
  • ✅ Memoized input handlers
  • ✅ No re-renders during playback

Backend Optimizations

  • ✅ Async frame capture
  • ✅ WebSocket audio buffering
  • ✅ Concurrent session support
  • ✅ Background file saving
  • ✅ Efficient memory management

Audio Quality

  • Sample Rate: 48000 Hz
  • Channels: 2 (stereo)
  • Bit Depth: 16-bit
  • Format: PCM (little-endian)
  • Bandwidth: ~192 KB/s per session
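
The bandwidth figure follows directly from the format: 48,000 samples/s × 2 channels × 2 bytes per sample = 192,000 bytes/s. The short sketch below (hypothetical helper names) reproduces that arithmetic and extends it to estimate how much raw audio a session of a given length produces.

```python
SAMPLE_RATE_HZ = 48_000
CHANNELS = 2
BYTES_PER_SAMPLE = 2  # 16-bit


def bytes_per_second() -> int:
    """Raw PCM data rate for one session."""
    return SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE


def pcm_size_bytes(seconds: float) -> int:
    """Raw PCM size for a session of the given length (WAV header excluded)."""
    return int(seconds * bytes_per_second())
```

A 2-second chunk comes to 384,000 bytes, and one minute of audio to about 11.5 MB.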

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License.


🙏 Acknowledgments

  • Google AI Studio - For Gemini 2.0 Flash and Lyria RealTime APIs
  • NOAA - For free space weather data APIs
  • Voyager Golden Record - Inspiration for the space theme
  • HackTX 2025 - For providing the opportunity to build this project

📞 Support

For issues or questions:

  • Open an issue on GitHub
  • Check existing issues for solutions
  • Review the troubleshooting section above

🚧 Roadmap

  • User authentication and session persistence
  • Playlist creation and management
  • Social sharing of generated music
  • Mobile app (React Native)
  • Additional music generation models
  • Cloud deployment guide
  • Docker containerization

Built with ❤️ for HackTX 2025
