Alex Cookson

https://alexcookson.com/
Recent content on Alex Cookson
© 2020. All rights reserved.
Fri, 20 Nov 2020

Applying PCA to fictional character personalities
https://alexcookson.com/post/2020-11-19-applying-pca-to-fictional-character-personalities/
Fri, 20 Nov 2020
In this post, we’re going to apply Principal Component Analysis (PCA) to a dataset of fictional character personalities. PCA is a common technique for dimensionality reduction, which you might want to do if you are, say, trying to put together a classification model and you have a dataset with a lot of variables. The dataset we’re using is of crowdsourced scores of personality traits for 800 fictional characters from books/movies/TV shows like Game of Thrones, Pride and Prejudice, and The Lion King.

Building an animation step-by-step with gganimate
https://alexcookson.com/post/2020-10-18-building-an-animation-step-by-step-with-gganimate/
Mon, 19 Oct 2020
Getting started with {gganimate} is tough. There’s a big set of new functions and behaviours to learn. And the path from idea to polished animation – if you’re like me – is riddled with dead-ends, error messages, and exclamations of “Why is it doing that?!” In this post, I want to be your {gganimate} guide and take you down one possible path that starts with an idea and ends with something beautiful.

Normalizing and rescaling children's book ratings (2 of 2)
https://alexcookson.com/post/normalizing-childrens-book-ratings/
Mon, 29 Jun 2020
Note: this is the second part of a two-post series where I “fix” some of the problems with crowd-sourced ratings, like those you find for movies or books. (In this series, I look at children’s books.) In the first part, I incorporated a Bayesian prior into the rating calculation to address books with very few ratings sometimes having extreme scores (like 5 out of 5 stars) that likely don’t reflect their actual quality.

Rating children's books with empirical Bayes estimation (1 of 2)
https://alexcookson.com/post/rating-childrens-books-with-empirical-bayes-estimation/
Wed, 17 Jun 2020
Ratings sites – like Rotten Tomatoes and IMDb for movies or Goodreads for books – are annoying. They each seem to have their norms where the same rating means different things on different sites. A rating of 60% on one site might be good, but 6/10 (equivalent to 60%) on another site might be terrible. So you need to do some extra mental work to set your expectations based on the specific site you’re on.

What can we learn from a country's diplomatic gifts?
https://alexcookson.com/post/what-can-we-learn-from-diplomatic-gifts/
Thu, 21 May 2020
Have you ever brought a bottle of wine, flowers, or chocolate babka to a dinner party as a host/hostess gift? Or brought home a souvenir for your parents, partner, or kids after you’ve been travelling – like chocolate from Switzerland or, uh… Brazil nuts from Brazil? Countries do the same thing, kind of. Diplomatic gifts are often exchanged when dignitaries travel abroad or receive visitors. They can be lavish, like a $780,000 emerald and diamond jewellery set, given by King Abdullah of Saudi Arabia.

What's the most successful Broadway show of all time?
https://alexcookson.com/post/most-successful-broadway-show-of-all-time/
Thu, 23 Apr 2020
I love musicals! Who doesn’t?! That feeling when the lights dim at the beginning of the show. The intermission conversation (post-bathroom!) of which songs you enjoyed the most. Spending the rest of the week (maybe month?) humming your favourites to the annoyance of everyone around you. What’s that? Les Misérables is obviously the best musical? I know, I know. I mean, Hamilton is good and all that, and it deserves praise, but it’s no Les Mis (don’t @ me).

How dangerous is climbing Mount Everest?
https://alexcookson.com/post/how-dangerous-is-climbing-mount-everest/
Mon, 06 Apr 2020
In this series of posts, we will analyze climbing expeditions to the Himalayas, a mountain range comprising over 50 mountains, including Mount Everest, the tallest mountain in the world. This is Part 2 of a two-part series:
- Part 1 looked at Himalayan peaks and their first ascents
- Part 2 (this post) looks at Everest expeditions
This post will focus on expeditions to Mount Everest, the most famous Himalayan peak and the tallest mountain in the world.

Analyzing Himalayan peaks and first ascents
https://alexcookson.com/post/analyzing-himalayan-peaks-first-ascents/
Sun, 22 Mar 2020
In this series of posts, we will analyze climbing expeditions to the Himalayas, a mountain range comprising over 50 mountains, including Mount Everest, the tallest mountain in the world. This is Part 1 of a two-part series:
- Part 1 (this post) looks at Himalayan peaks and their first ascents
- Part 2 looks at how dangerous it is to climb Everest
This post will focus on getting an overview of the Himalayan peaks, especially their height, whether they’ve been summitted, and (if it applies) when the first ascent was and who was involved.

Mapping San Francisco's trees
https://alexcookson.com/post/mapping-san-francisco-trees/
Wed, 29 Jan 2020
In this post, I create some basic geographical maps using the San Francisco Trees dataset from TidyTuesday, a project that shares a new dataset each week to give R users a way to apply and practice their skills. Getting started with geographical mapping in R can be daunting because there is a lot of terminology to describe a lot of methods that are specific to mapping. There is a whole discipline – Geographic Information Systems – dedicated to this stuff, so it’s no surprise that it can get complicated fast.

Heat mapping the timing of Philadelphia parking tickets
https://alexcookson.com/post/tidytuesday-philadelphia-parking-tickets/
Thu, 05 Dec 2019
In this post, I create heat maps using the Philly Parking Tickets dataset from TidyTuesday, a project that shares a new dataset each week to give R users a way to apply and practice their skills. Specifically, we’ll cover:
- Cleaning and aggregating the data that will go into our heat map
- Creating a basic heat map with ggplot2 defaults
- Tweaking ggplot2 theme components to get a much prettier heat map

Predicting horror movie ratings with LASSO regression
https://alexcookson.com/post/tidytuesday-horror-movies/
Mon, 21 Oct 2019
In this post, I look at the Horror movie ratings dataset from TidyTuesday, a project that shares a new dataset each week to give R users a way to apply and practice their skills. We’re going to run a LASSO regression, a type of regularization. Regularization is often used when you have lots of predictors (compared to your number of observations) or when your data has multi-collinearity – predictors that are highly correlated with one another.

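To make the LASSO idea in the summary above concrete, here is a minimal sketch in plain Python. It is not the author's code (the post itself works in R), and the function names, the toy data, and the penalty value are all assumptions chosen for illustration. It minimizes (1/(2n)) * sum of squared residuals + penalty * sum of absolute coefficients by coordinate descent, with no intercept because the toy data is already centered.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; anything within the penalty becomes 0.0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, sweeps=100):
    """Hypothetical helper: LASSO coefficients via cyclic coordinate descent.

    X is a list of rows, y a list of targets. No intercept is fitted,
    so the columns of X and the values of y are assumed to be centered.
    """
    n = len(X)
    p = len(X[0])
    coefs = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0    # covariance of predictor j with the partial residual
            scale = 0.0  # sum of squares of predictor j
            for row, target in zip(X, y):
                # Prediction from every predictor except j
                other = sum(row[k] * coefs[k] for k in range(p) if k != j)
                rho += row[j] * (target - other)
                scale += row[j] ** 2
            coefs[j] = soft_threshold(rho / n, penalty) / (scale / n)
    return coefs


# Toy data (made up): y is 3 * first predictor + 0.2 * second predictor.
X = [[-2.0, 1.0], [-1.0, -2.0], [0.0, 0.0], [1.0, 2.0], [2.0, -1.0]]
y = [-5.8, -3.4, 0.0, 3.4, 5.8]

coefs = lasso_coordinate_descent(X, y, penalty=0.5)
print(coefs)
```

With this penalty the strong predictor keeps a coefficient that is shrunk below its true value of 3, while the weak predictor's coefficient is set to exactly zero. That combination of shrinkage and variable selection is what makes LASSO useful when there are many predictors. The two toy predictors here are uncorrelated, so the sketch does not demonstrate the multi-collinearity case.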
How much can professional powerlifters bench press?
https://alexcookson.com/post/tidytuesday-powerlifting/
Tue, 08 Oct 2019
In this post, I analyze the Powerlifting dataset from TidyTuesday, a project that shares a new dataset each week to give R users a way to apply and practice their skills. This week’s data is about the results of powerlifting events that are part of the International Powerlifting Federation. I will be predicting bench press weight with a multiple linear regression model. What’s more, I will be using natural cubic splines to incorporate non-linear trends into our model.

What are New York's best and worst pizza restaurants?
https://alexcookson.com/post/tidytuesday-new-york-pizza-restaurants/
Mon, 30 Sep 2019
In this post, I analyze the Pizza Party dataset from TidyTuesday, a project that shares a new dataset each week to give R users a way to apply and practice their skills. This week’s data is about survey ratings of New York pizza restaurants.
Setup
First, let’s load the tidyverse, change our default ggplot2 theme, and load the data. (I named the dataframe pizza_barstool_raw because I’ll probably add some cleaning steps and I like to have the original data on hand.

Finding trends in US national park visits
https://alexcookson.com/post/tidytuesday-us-national-parks/
Mon, 16 Sep 2019
In this post, I analyze the National Park Visits dataset from TidyTuesday, a project that shares a new dataset each week to give R users a way to apply and practice their skills. This week’s data is about visitor numbers for US National Parks, going way back to 1904, when there were only six national parks. I’ve never been to a US national park, but I know about some of the famous ones like Yosemite and Yellowstone.

About me
https://alexcookson.com/about/
Hi! I’m Alex. I love data science, cycling, and cats. I currently work at the Royal Canadian Mint, where I help the marketing team make the most of their data. I’ve worked on many machine learning and business analytics projects, including predictive models for customer churn, recommendation engines, and self-service dashboards. I have Master’s degrees in International Business from HEC Paris and Queen’s University. I also earned my Bachelor’s degree from Queen’s University, where I studied commerce.