A Slack integration that monitors channels for YouTube URLs, extracts and summarizes video transcripts using OpenAI, and stores the summaries in a Weaviate vector database for future reference and querying. It is designed specifically for generative AI podcasts and interviews with founders, operators, and researchers, with the goal of extracting technical insights.
YouTube Transcript Bot processes YouTube videos shared in Slack. When a team member posts a YouTube link in a monitored channel, the bot:
- Extracts the video transcript
- Generates a concise summary using OpenAI
- Delivers the summary as a direct message to the specified user
- Stores the summary in a Weaviate vector database for future semantic searching
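The first step above, spotting a YouTube link in a Slack message, can be sketched with the standard library alone. This is a minimal illustration rather than the bot's actual code: the function name `extract_video_ids` and the set of accepted hosts are assumptions, and it only handles the common `watch?v=` and `youtu.be/` URL forms.

```python
import re
from urllib.parse import urlparse, parse_qs

# Slack often wraps links as <https://...> or <https://...|label>,
# so the pattern stops at whitespace, angle brackets, and the pipe.
URL_PATTERN = re.compile(r"https?://[^\s<>|]+")

# Hypothetical allow-list of hosts; extend as needed.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_ids(message_text):
    """Return the YouTube video IDs found in a message, in order, without duplicates."""
    ids = []
    for url in URL_PATTERN.findall(message_text):
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        video_id = None
        if host == "youtu.be":
            # Short links carry the ID as the first path segment.
            video_id = parsed.path.lstrip("/").split("/")[0]
        elif host in YOUTUBE_HOSTS and parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            video_id = parse_qs(parsed.query).get("v", [None])[0]
        if video_id and video_id not in ids:
            ids.append(video_id)
    return ids
```

Other URL shapes (for example embedded players or playlists) are deliberately ignored here and would need extra branches.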
- Slack Channel Monitoring: Watches specified Slack channels for YouTube URLs
- Transcript Extraction: Pulls transcripts from YouTube videos
- AI-Powered Summarization: Uses OpenAI to create summaries
- Private Delivery: Sends summaries as direct messages
- Vector Database Storage: Stores summaries in Weaviate for semantic searching
- Query Functionality: Supports natural language queries with the !query command
The application uses:
- Arcade.dev for seamless Slack integration without requiring a custom Slack app
- YouTube Transcript API for extracting video transcripts
- OpenAI API for generating summaries
- Weaviate serverless cluster for vector database storage
- Weaviate Query Agent for efficient natural language querying of stored summaries
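One way to picture how these components fit together is a small pipeline object that receives each service as a plain callable. This is a sketch under stated assumptions, not the repository's real structure: the `Pipeline` class and its field names are hypothetical, and the lambdas in the example stand in for the actual YouTube Transcript API, OpenAI, Arcade.dev, and Weaviate calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Pipeline:
    """Hypothetical wiring of the four services as injected callables."""
    fetch_transcript: Callable[[str], str]   # video_id -> transcript text
    summarize: Callable[[str], str]          # transcript -> summary
    send_dm: Callable[[str, str], None]      # (username, summary)
    store: Callable[[str, str], None]        # (video_id, summary)

    def process(self, video_id: str, target_username: str) -> Optional[str]:
        transcript = self.fetch_transcript(video_id)
        if not transcript:
            # No transcript available: nothing to summarize or store.
            return None
        summary = self.summarize(transcript)
        self.send_dm(target_username, summary)
        self.store(video_id, summary)
        return summary


# Stub services for illustration only; real ones would call external APIs.
sent, stored = [], {}
pipeline = Pipeline(
    fetch_transcript=lambda video_id: "full transcript text",
    summarize=lambda transcript: transcript[:15],
    send_dm=lambda user, summary: sent.append((user, summary)),
    store=lambda video_id, summary: stored.update({video_id: summary}),
)
result = pipeline.process("abc123", "alice")
```

Passing the services in as callables keeps each integration replaceable and lets the flow be exercised without network access.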
The bot uses Weaviate's Query Agent for natural language querying. This pre-built agentic service abstracts away the complexities of vector search, so users can query stored summaries in plain language.
When a user issues a !query command, the Weaviate Query Agent analyzes the query, determines the appropriate search strategy, and returns relevant summaries that best match the semantic intent.
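Before a question can be handed to the Query Agent, the bot has to recognize the command in a channel message. The following is a minimal sketch of that parsing step only; the function name `parse_query_command` is an assumption, and the call to the Query Agent itself is intentionally left out.

```python
QUERY_PREFIX = "!query"


def parse_query_command(message_text):
    """Return the question following !query, or None if the message is not a query."""
    text = message_text.strip()
    if not text.lower().startswith(QUERY_PREFIX):
        return None
    rest = text[len(QUERY_PREFIX):]
    # Require whitespace after the prefix so "!queryfoo" is not treated as a command.
    if rest and not rest[0].isspace():
        return None
    question = rest.strip()
    # A bare "!query" with no question is ignored.
    return question or None
```

For example, `parse_query_command("!query What are the best practices for React performance?")` yields the question text, while an ordinary message yields `None`.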
The bot uses Arcade.dev to seamlessly integrate with Slack without requiring a custom Slack app. This integration provides secure user impersonation, independent scaling, and enterprise-ready security while significantly simplifying the development process.
- Python 3.8 or higher
- An Arcade.dev account with API key
- An OpenAI API key
- A Weaviate Cloud serverless cluster (free tier available)
- A Slack workspace where you have permissions to add apps
- Clone the repository:

      git clone https://github.com/yourusername/youtube-transcript-bot.git
      cd youtube-transcript-bot

- Install required packages:

      pip install -r requirements.txt

- Set up environment variables. Create a .env file in the project root with the following variables:

      # API Keys
      ARCADE_API_KEY=your_arcade_api_key
      OPENAI_API_KEY=your_openai_api_key
      WEAVIATE_URL=your_weaviate_cluster_url
      WEAVIATE_API_KEY=your_weaviate_api_key

      # User Configuration
      USER_ID=your_email@example.com
      CHANNEL_NAME=youtube
      TARGET_USERNAME=username_to_receive_summaries

- Run the application:

      python main.py
On first run, you'll need to authorize the application with Slack through the Arcade.dev API.
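A missing or empty variable is an easy mistake to make during setup, so a startup check can help. The sketch below validates the seven variables listed above; the function name `load_settings` is hypothetical, and it reads from the process environment, so it assumes the .env file has already been loaded by whatever mechanism the project uses (the source does not say which).

```python
import os

# Variable names taken from the .env template above.
REQUIRED_VARS = (
    "ARCADE_API_KEY",
    "OPENAI_API_KEY",
    "WEAVIATE_URL",
    "WEAVIATE_API_KEY",
    "USER_ID",
    "CHANNEL_NAME",
    "TARGET_USERNAME",
)


def load_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing or blank."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

Calling `load_settings()` at the top of the entry point would surface configuration problems immediately instead of partway through processing a video.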
- Start the bot using python main.py
- Share YouTube links in the configured Slack channel
- The bot will process videos and send summaries as direct messages
- Query past summaries with !query [your question] in the Slack channel
Example: !query What are the best practices for React performance?
If the bot isn't working as expected:
- Ensure each tool_name refers to the latest version of the Arcade tools: Slack.GetMessagesInChannelByName@[latest version], Slack.GetChannelMetadataByName@[latest version], Slack.SendDmToUser@[latest version]
- Ensure all API keys in your .env file are correct
- Verify the YouTube link has an available transcript
- Check that the channel name matches exactly (case-sensitive)
- Make sure your Weaviate cluster is running
MIT Licensed