I have a pandas dataframe with a categorical series in which some categories never occur in the data.
In the example shown below, group has the categories "a", "b", and "c", but there are no cases of "c" in the dataframe.
import pandas as pd
dfr = pd.DataFrame({
    "id": ["111", "222", "111", "333"],
    "group": ["a", "a", "b", "b"],
    "value": [1, 4, 9, 16]})
dfr["group"] = pd.Categorical(dfr["group"], categories=["a", "b", "c"])
dfr.pivot(index="id", columns="group")
The resulting pivoted dataframe has columns a and b. I expected a c column as well, containing only missing values.
      value
group     a     b
id
111     1.0   9.0
222     4.0   NaN
333     NaN  16.0
How can I pivot a dataframe on a categorical series to include columns with all categories, regardless of whether they were present in the original dataframe?
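One possible approach, offered as a minimal sketch rather than a definitive answer: pivot as before, then reindex the columns against the full list of categories. The sketch assumes that passing values="value" to get flat column labels (instead of the two-level header shown above) is acceptable; the names wide and all_groups are made up for illustration.

```python
import pandas as pd

dfr = pd.DataFrame({
    "id": ["111", "222", "111", "333"],
    "group": ["a", "a", "b", "b"],
    "value": [1, 4, 9, 16]})
dfr["group"] = pd.Categorical(dfr["group"], categories=["a", "b", "c"])

# Assumption: select "value" explicitly so the result has flat column labels.
wide = dfr.pivot(index="id", columns="group", values="value")

# Reindex the columns against every declared category; categories that
# never occur in the data ("c" here) come back as all-NaN columns.
all_groups = list(dfr["group"].cat.categories)
wide = wide.reindex(columns=all_groups)

print(wide)
```

Reindexing after the pivot keeps the pivot call itself unchanged and relies only on the categories stored on the series, so the same two lines should apply to any other categorical column.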