Source Code
https://files.catbox.moe/y1ta9y.zip
Inspiration
Well, I heard that Reddit was pretty hard to build a trend analysis program for, owing to its terrible search API, which is paid on top of that. So I decided to make a Reddit trend analyzer.
What it does
For each of the requested number of months, it goes through at most 100 Reddit posts, aggregates and ranks the terms in their titles and body text, builds a time series map, then plots that time series to an SVG file that is displayed in the program.
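To make the "time series map" step concrete, here is a minimal sketch of one way to shape it: a map from each term to its per-month counts. The `Post` struct, its field names, and `build_time_series` are made up for this illustration and are not taken from the project source. The real program also ranks terms, which this sketch skips.

```rust
use std::collections::BTreeMap;

/// Hypothetical post record; the field names are assumptions for this sketch.
pub struct Post {
    pub month: String, // e.g. "2024-01"
    pub title: String,
    pub body: String,
}

/// Builds a map of term -> (month -> number of occurrences).
/// BTreeMap keeps both terms and months sorted, which is handy for plotting.
pub fn build_time_series(posts: &[Post]) -> BTreeMap<String, BTreeMap<String, usize>> {
    let mut series: BTreeMap<String, BTreeMap<String, usize>> = BTreeMap::new();
    for post in posts {
        // Title and body are treated as one lowercase blob of text.
        let text = format!("{} {}", post.title, post.body).to_lowercase();
        // Split on anything that is not a letter or digit, dropping empty pieces.
        for term in text
            .split(|c: char| !c.is_alphanumeric())
            .filter(|t| !t.is_empty())
        {
            *series
                .entry(term.to_string())
                .or_default()
                .entry(post.month.clone())
                .or_insert(0) += 1;
        }
    }
    series
}
```

Each inner map is then one line on the plot: months on the x axis, counts on the y axis.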
How we built it
The core of the application is the PullPush API, which I wrote a wrapper crate for. This is basically our replacement for the Reddit API and, on top of being free, it also has much saner search capabilities. The application also uses rust-tfidf to rank terms based on their frequency in post titles and body text.
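For anyone unfamiliar with TF-IDF, the idea can be sketched by hand in a few lines. This is not the rust-tfidf API and not the project's code. It is a hand-rolled illustration that assumes the plain textbook weighting (term frequency times the natural log of inverse document frequency), with `rank_terms` as a made-up name.

```rust
use std::collections::HashMap;

/// Scores every distinct term in `docs[doc_index]` with tf * idf, highest first.
/// tf  = occurrences of the term in the document / document length
/// idf = ln(total documents / documents containing the term)
pub fn rank_terms(docs: &[Vec<String>], doc_index: usize) -> Vec<(String, f64)> {
    let doc = &docs[doc_index];

    // Count how often each term appears in the chosen document.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for term in doc {
        *counts.entry(term.as_str()).or_insert(0) += 1;
    }

    let n_docs = docs.len() as f64;
    let mut scored: Vec<(String, f64)> = counts
        .into_iter()
        .map(|(term, count)| {
            let tf = count as f64 / doc.len() as f64;
            // At least 1, because the term occurs in `doc` itself.
            let containing = docs
                .iter()
                .filter(|d| d.iter().any(|t| t.as_str() == term))
                .count() as f64;
            let idf = (n_docs / containing).ln();
            (term.to_string(), tf * idf)
        })
        .collect();

    // Highest score first; ties are broken alphabetically for stable output.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    scored
}
```

A term that shows up in every document gets an idf of zero, so words that are common everywhere sink to the bottom of the ranking even if they are frequent in a single post.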
Challenges we ran into
Doing this project with Reddit felt a lot like doing it on hard mode, for the obvious reasons: Reddit's API was paid and it was terrible at searching. The before and after parameters did not operate on dates or Unix epoch times but on post IDs (???), which was unintuitive. So I swapped the Reddit API for the PullPush API. Even though the search was better, there were still other small problems. For example, the over_18 query parameter would just not work for some reason. This shouldn't be a problem if you aren't going into subreddits with 18+ content in the first place, but it's something to note. Also, there's a fairly arbitrary limit on the number of posts you can get at a time.
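As a rough illustration of why epoch-based before and after parameters are easier to work with, here is a sketch that slices time into windows and builds one search URL per window. The endpoint path and the parameter names (subreddit, after, before, size) are my understanding of the public PullPush submission search and should be checked against its docs. The function names and the fixed 30-day "month" are assumptions made for this sketch, not the wrapper crate's actual interface.

```rust
/// Length of one window in seconds. A fixed 30 days is a simplification;
/// real calendar months vary in length.
const SECONDS_PER_WINDOW: u64 = 30 * 24 * 60 * 60;

/// Returns `(after, before)` Unix-epoch pairs, newest window first.
pub fn month_windows(now: u64, months: u64) -> Vec<(u64, u64)> {
    (0..months)
        .map(|i| {
            let before = now.saturating_sub(i * SECONDS_PER_WINDOW);
            let after = before.saturating_sub(SECONDS_PER_WINDOW);
            (after, before)
        })
        .collect()
}

/// Builds a submission search URL for one window.
/// Assumes `subreddit` needs no URL escaping (letters, digits, underscores).
pub fn search_url(subreddit: &str, after: u64, before: u64, size: u32) -> String {
    format!(
        "https://api.pullpush.io/reddit/search/submission/?subreddit={}&after={}&before={}&size={}",
        subreddit, after, before, size
    )
}
```

One request per window with size set to 100 lines up with the "at most 100 posts in one month" behavior described earlier.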
Unlike some other platforms, Reddit has no tags. So you're forced to use the title and body text, which will inevitably contain a lot of (functional) garbage that you don't need, like punctuation and overused stopwords ("the", "a", "is") that will poison the trend set.
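A minimal sketch of that cleanup step, assuming a tiny hardcoded stopword list (a real one would be much longer) and a made-up `tokenize` function that is not taken from the project source:

```rust
/// A deliberately tiny stopword list, for illustration only.
const STOPWORDS: &[&str] = &["the", "a", "an", "is", "are", "of", "to", "and", "in", "it"];

/// Lowercases the text, splits it on anything that is not a letter or digit
/// (which drops punctuation), and removes stopwords.
pub fn tokenize(text: &str) -> Vec<String> {
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty() && !STOPWORDS.contains(t))
        .map(String::from)
        .collect()
}
```

Note that splitting on non-alphanumeric characters also breaks up things like "don't" and URLs, which is part of why the results are rough.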
By the time I was done with the actual data collection and visualization part, I had basically no time left to finish the GUI, so I made a super basic one.
After hacking that together I was too tired to make any more enhancements to the project, so what you see is what you get.
Accomplishments that we're proud of
Built a GUI. Yes, this is actually the first GUI I've ever built, and it looks awful, but the point is that it works perfectly fine.
I also built a trend analyzer that actually analyzes trends without requiring tags (though it does so very roughly), so I guess that's a thing too.
What we learned
- Basic natural language processing
- Building a GUI
- Making wrappers for APIs
What's next for reddit-analyzer
Nothing, probably.