Inspiration
In recent years, content creation has expanded from uploading videos and images to live streaming, which is growing rapidly. Many streamers now rely on entire teams to decode platform algorithms, follow trends, grow their audiences, and increase engagement.
Our team was inspired by this shift and by the lack of objective tools for measuring game popularity and community sentiment. We wanted to give streamers, game developers, and agencies a data-driven way to evaluate trending games and user reviews across forums, social media, and gaming communities.
What it does
HypeHunter is a real-time esports analytics platform that:
- Tracks game popularity across multiple sources including forums, Reddit, social media, and gaming databases like Steam.
- Processes community discussions to generate objective review scores for games.
- Provides actionable insights and visualizations through a dashboard, showing trending games and detailed reviews.
- Helps streamers, developers, and agencies make informed decisions on what games to play, promote, or develop.
How we built it
We developed a web application with a modern dashboard interface. The system workflow includes:
- Data ingestion: Collecting text and metadata from multiple sources and formats.
- NLP pipeline: Cleaning, processing, and analyzing text data to extract sentiment and key discussion topics.
- AI & LLM predictions: Using machine learning and large language models to predict game popularity and generate review scores.
- Visualization: Displaying trending games per day and their community sentiment through interactive charts and detailed game pages.
Technologies used include React.js for the frontend, Python for the NLP pipeline, and various APIs and databases for aggregating game-related data.
How HypeHunter Works
HypeHunter is designed to identify trending games and provide insights to gamers, streamers, and developers. Here’s a breakdown of our data collection and processing workflow:
Data Collection: We gather raw data from multiple sources using various APIs:
- RAWG API – For detailed game metadata.
- Twitter API – To track social buzz around games.
- Reddit API – To monitor discussions in popular gaming subreddits.
- Twitch API – To get live viewer statistics.
- Steam API – For player metrics and game stats.
Trending Game Identification: Our intelligent agent scans subreddit posts and comments to generate a list of potentially trending games.
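As a rough illustration of candidate detection, the sketch below counts how many posts mention each title from a known list and keeps the titles that pass a threshold. This is a simple keyword-count stand-in, not the agent described above; the function name, the `min_mentions` parameter, and the sample posts and titles are all made up for the example.

```python
import re
from collections import Counter


def find_trending_candidates(posts, known_titles, min_mentions=2):
    """Return (title, post_count) pairs for titles mentioned in enough posts."""
    counts = Counter()
    for text in posts:
        lowered = text.lower()
        for title in known_titles:
            # Word-boundary match so a title is not matched inside a longer word.
            pattern = r"\b" + re.escape(title.lower()) + r"\b"
            if re.search(pattern, lowered):
                counts[title] += 1
    return [(title, n) for title, n in counts.most_common() if n >= min_mentions]


# Made-up sample posts and titles, for illustration only.
posts = [
    "Just finished Star Harbor, what a run",
    "Is Star Harbor worth it in early access?",
    "Pixel Quest servers are down again",
]
candidates = find_trending_candidates(posts, ["Star Harbor", "Pixel Quest"])
```

A keyword count like this only finds titles it already knows about; an LLM-based agent can also surface games that are missing from the list.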
Data Aggregation: The candidate games are then queried across all APIs to fetch relevant information.
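One way to structure this fan-out is to call one fetcher per source and collect the results under the source name, so that a single failing API does not drop the game. The sketch below assumes this shape; `aggregate_game_data`, the stub fetchers, and their return values are hypothetical and do not call any real API.

```python
def aggregate_game_data(title, fetchers):
    """Call each source fetcher for one title and group results by source name."""
    record = {"title": title, "sources": {}, "errors": {}}
    for source, fetch in fetchers.items():
        try:
            record["sources"][source] = fetch(title)
        except Exception as exc:
            # Keep going: one failing source should not discard the whole game.
            record["errors"][source] = str(exc)
    return record


# Stub fetchers standing in for real API clients (hypothetical return shapes).
def fake_steam(title):
    return {"current_players": 1200}


def fake_twitch(title):
    raise TimeoutError("twitch request timed out")


record = aggregate_game_data(
    "Star Harbor", {"steam": fake_steam, "twitch": fake_twitch}
)
```

Recording errors per source makes it possible to score a game on partial data and retry the missing sources later.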
Information Processing: Using a large language model (Gemini 2.0 Flash), we process the collected data to generate structured JSON containing detailed information about each game.
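Model replies often wrap the JSON in extra text, so the output is usually validated before it is used downstream. The following is a minimal sketch of that step using only the standard library; the required field names are a hypothetical schema, not the project's actual one.

```python
import json

# Hypothetical schema: the real field names are not given in this write-up.
REQUIRED_FIELDS = ("title", "rating", "playtime_hours")


def parse_game_json(llm_text):
    """Parse the JSON object embedded in a model reply and check required fields."""
    # Slice from the first '{' to the last '}' to skip any surrounding prose.
    start = llm_text.find("{")
    end = llm_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(llm_text[start:end + 1])
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


reply = 'Here is the result: {"title": "Star Harbor", "rating": 4.3, "playtime_hours": 120}'
game = parse_game_json(reply)
```

Failing loudly on malformed or incomplete output keeps bad records out of the rating step.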
Rating System: A second LLM evaluates each game and assigns a score for each of several factors.
Rating Logic:
Rating:
- rating > 4 → Score 5
- 2 < rating ≤ 4 → Score 3
- rating ≤ 2 → Score 1
Playtime:
- playtime > 100 hours → Score 5
- 50 < playtime ≤ 100 hours → Score 3
- playtime ≤ 50 hours → Score 1
YouTube Views:
- views > 1M → Score 5
- 100K < views ≤ 1M → Score 3
- views ≤ 100K → Score 1
Twitch Viewer Count:
- viewers ≥ 1000 → Score 5
- 100 < viewers ≤ 999 → Score 3
- viewers ≤ 100 → Score 1
Metacritic Score:
- score > 80 → Score 5
- 60 < score ≤ 80 → Score 3
- score ≤ 60 → Score 1
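The thresholds above can also be written as a small deterministic function, which is handy for checking the LLM's output. This is a sketch, not the project's implementation (the write-up has an LLM assign the scores). The metric key names are hypothetical, the top band is read as strictly above the upper threshold, and the Twitch top band is assumed to start at 1000 because the middle band ends at 999.

```python
def band_score(value, low, high, high_inclusive=False):
    """Map a value to 1, 3, or 5 using a lower and an upper threshold."""
    if value is None:
        return None  # metric unavailable for this game
    in_top_band = value >= high if high_inclusive else value > high
    if in_top_band:
        return 5
    if value > low:
        return 3
    return 1


# (low, high) thresholds per factor, taken from the rating logic above.
THRESHOLDS = {
    "rating": (2, 4),
    "playtime_hours": (50, 100),
    "youtube_views": (100_000, 1_000_000),
    "twitch_viewers": (100, 1000),
    "metacritic": (60, 80),
}


def score_game(metrics):
    """Return a per-factor score dict for one game's metrics."""
    scores = {}
    for factor, (low, high) in THRESHOLDS.items():
        # Assumption: only the Twitch top band includes its threshold (>= 1000).
        inclusive = factor == "twitch_viewers"
        scores[factor] = band_score(metrics.get(factor), low, high, inclusive)
    return scores


scores = score_game({
    "rating": 4.3,
    "playtime_hours": 75,
    "youtube_views": 90_000,
    "twitch_viewers": 1000,
    "metacritic": 80,
})
```

For this sample input, the rating and Twitch viewer count fall in the top band, playtime and Metacritic in the middle band, and YouTube views in the bottom band.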
Database Storage: The original JSON data and the calculated ratings are stored in a MongoDB database, allowing our frontend to fetch and dynamically update the UI with trending game information.
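A storage step along these lines could upsert one document per game, so that re-running the pipeline refreshes existing entries instead of duplicating them. The sketch below assumes a PyMongo-style collection object is passed in; the document layout, the `title` key, and the database and collection names in the comment are hypothetical.

```python
from datetime import datetime, timezone


def build_game_document(game_json, ratings):
    """Combine the raw game JSON and the computed ratings into one document."""
    return {
        "title": game_json["title"],
        "raw": game_json,
        "ratings": ratings,
        "updated_at": datetime.now(timezone.utc),
    }


def save_game(collection, game_json, ratings):
    """Upsert by title so a repeated run updates the existing game document."""
    doc = build_game_document(game_json, ratings)
    return collection.update_one(
        {"title": doc["title"]}, {"$set": doc}, upsert=True
    )


# With pymongo installed and a MongoDB server running, a collection could be
# obtained like this (database and collection names are made up):
#   from pymongo import MongoClient
#   collection = MongoClient("mongodb://localhost:27017")["hypehunter"]["games"]
#   save_game(collection, {"title": "Star Harbor"}, {"rating": 5})
```

Keeping the raw JSON next to the ratings lets the frontend show both the score and the data it was derived from in a single query.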
Challenges we ran into
- Data diversity: Handling different data formats and sources, from Reddit posts to Steam reviews.
- Real-time processing: Designing the system to deliver predictions with minimal latency.
- Sentiment accuracy: Ensuring NLP models correctly interpreted slang, sarcasm, and gaming-specific language.
- Dashboard usability: Presenting complex insights in a simple, actionable format for users.
Accomplishments that we're proud of
- Successfully built a real-time trend detection system for esports games.
- Developed an objective review scoring system derived from community discussions.
- Created a user-friendly dashboard that delivers actionable insights to streamers, developers, and agencies.
- Integrated multiple data sources into a single pipeline efficiently.
What we learned
- How to aggregate and process large-scale text data from diverse sources.
- The power and limitations of NLP and LLMs for understanding online sentiment.
- Best practices for building dashboards that translate complex data into actionable insights.
- Collaboration and problem-solving under tight hackathon deadlines.
What's next for HypeHunter
- Adding predictive analytics to forecast future game popularity trends.
- Expanding to more data sources, including Twitch, Discord, and YouTube gaming streams.
- Improving accuracy of sentiment analysis by fine-tuning models for gaming language.
- Offering personalized recommendations for streamers and developers based on audience behavior.
Built With
- fastapi
- google-ai-studio-api
- kafka
- langchain
- mongodb
- python
- rawg-api
- react
- steam-api
- twitch
- twitter-ads-api
- websockets