Identifying the percentiles of a dataset
The percentile is a useful statistic because it can describe both the spread and the center of a dataset. The 99 percentiles split the sorted dataset into 100 equal portions, so each percentile gives the value below which a given percentage of the observations fall. This lets us determine which values in a dataset lie above or below a certain limit. For example, the value of the 50th percentile is the same as the median.
To analyze the percentiles of a dataset, we will use the percentile function from the numpy library in Python.
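As a quick illustration before turning to the real data, the sketch below applies np.percentile to a small array. The values are made up for this example and are not taken from the COVID-19 dataset. It assumes numpy's default linear interpolation between neighboring data points.

```python
import numpy as np

# Hypothetical sample values for illustration only (not the COVID-19 data)
values = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20])

# The 50th percentile should match the median
median = np.median(values)
p50 = np.percentile(values, 50)

# The 60th percentile: roughly 60% of the values fall at or below this point
p60 = np.percentile(values, 60)

# Several percentiles can be requested at once by passing a list
quartiles = np.percentile(values, [25, 50, 75])

print("Median:", median)
print("50th percentile:", p50)
print("60th percentile:", p60)
print("25th, 50th, 75th percentiles:", quartiles)
```

Because the requested percentile rarely lands exactly on a data point, numpy interpolates between the two nearest values by default, so the result need not be a value that appears in the dataset.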
Getting ready
We will work with the COVID-19 cases again for this recipe.
How to do it…
We will compute the 60th percentile using the numpy library:
- Import the numpy and pandas libraries:
  import numpy as np
  import pandas as pd
- Load the .csv into a dataframe using read_csv. Then subset the dataframe to include only relevant columns:
  covid_data = pd.read_csv("covid-data...