Consider df:

In [2098]: df = pd.DataFrame({'a': [1,2], 'b':[3,4]})

In [2099]: df
Out[2099]: 
   a  b
0  1  3
1  2  4

Now, I try to append a list of values to df:

In [2102]: df.loc[2] = [3, 4]

In [2103]: df
Out[2103]: 
   a  b
0  1  3
1  2  4
2  3  4

All's good so far.

But when I try to append a row from a list of boolean values, they are converted to int:

In [2104]: df.loc[3] = [True, False]

In [2105]: df
Out[2105]: 
   a  b
0  1  3
1  2  4
2  3  4
3  1  0

I know I can convert my df to str and then append boolean values, like this:

In [2131]: df = df.astype(str)
In [2133]: df.loc[3] = [True, False]

In [2134]: df
Out[2134]: 
      a      b
0     1      3
1     2      4
3  True  False

But, I want to know the reason behind this behaviour. Why is it not automatically changing the dtypes of columns to object when I append boolean to it?

My Pandas version is:

In [2150]: pd.__version__
Out[2150]: '1.1.0'
  • '1.1.0' is my pandas version. Commented Dec 23, 2020 at 8:07
  • In my opinion, mixing types is not recommended, so it may well behave buggily. The same problem occurs with df.append(pd.Series([True, False], index=['a','b']), ignore_index=True). Commented Dec 23, 2020 at 8:13
  • @MayankPorwal Because, as I said, in Python (not sure if pandas and numpy do the same) booleans are a subclass of integers: docs.python.org/3/reference/datamodel.html#index-10 Commented Dec 23, 2020 at 8:13
  • Agreed with Dani, since Python booleans are binary ints: 1+True returns 2, and in the same way you get an int when you add True and False. Commented Dec 23, 2020 at 8:18
  • @jezrael Yes, same problem with series also. Commented Dec 23, 2020 at 8:20

2 Answers

Why is it not automatically changing the dtypes of columns to object when I append boolean to it?

Because the types are being upcast (see upcasting). From the documentation:

Types can potentially be upcasted when combined with other types, meaning they are promoted from the current type (e.g. int to float).

Upcasting works according to numpy rules:

Upcasting is always according to the numpy rules. If two different dtypes are involved in an operation, then the more general one will be used as the result of the operation.

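To understand how the numpy rules are applied, you can use the function find_common_type (deprecated in NumPy 1.25 and removed in 2.0), as below:

import numpy as np

# np.bool_ is NumPy's boolean scalar type (the np.bool alias was deprecated in NumPy 1.20)
res = np.find_common_type([bool, np.bool_], [np.int32, np.int64])
print(res)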

Output

int64
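
Since find_common_type is not available in recent NumPy releases, the same check can be sketched with np.promote_types and np.result_type, which both exist in current NumPy. This is only an illustration of the promotion rule quoted above, not a trace of what pandas does internally; the variable names are made up for the example.

```python
import numpy as np

# Python's bool is a subclass of int, so True and False behave like 1 and 0.
bool_is_int = issubclass(bool, int)

# NumPy's promotion rules pick the more general dtype for a bool/int pair.
promoted = np.promote_types(np.bool_, np.int64)
combined = np.result_type(np.bool_, np.int64)

print(bool_is_int)  # True
print(promoted)     # int64
print(combined)     # int64
```

Because int64 can represent both booleans (as 0 and 1) and integers, it is the "more general" dtype, and there is no need to fall back to object.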
1 Comment

Thanks a lot for this explanation.
When you do df.loc[0], the row is converted into a pd.Series, as shown below:

>>> type(df.loc[0])
<class 'pandas.core.series.Series'>

A Series can only have a single dtype, so the booleans are coerced to integers.

The fix, if you are trying to get rows, is to use df.loc[[0]]:

>>> type(df.loc[[0]])
<class 'pandas.core.frame.DataFrame'>

But in this case, you need to create the new row first and then set its values with df.loc[[...]], because df.loc[[...]] only indexes existing rows; you can't add new rows with it.

So here is how you can get the rows with df.loc[[...]]:

>>> df = pd.DataFrame({'a': [1,2], 'b':[3,4]})
>>> df.loc[0]
a    1
b    3
Name: 0, dtype: int64
>>> df.loc[[0]]
   a  b
0  1  3
>>> 

Here you can see the difference: the first call returns a Series, which has only one dtype, whereas the second returns a DataFrame.
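
To make the single-dtype point concrete, here is a small sketch using a hypothetical frame with one int column and one float column (not the df from the question). Selecting a row as a Series forces both values into one dtype, while selecting it as a one-row DataFrame keeps a dtype per column.

```python
import pandas as pd

# Hypothetical frame for illustration: 'a' is int64, 'b' is float64.
df = pd.DataFrame({'a': [1, 2], 'b': [3.5, 4.5]})

row_as_series = df.loc[0]    # Series: one dtype for the whole row
row_as_frame = df.loc[[0]]   # DataFrame: one dtype per column

print(row_as_series.dtype)   # float64 (the int was upcast)
print(row_as_frame.dtypes)   # a: int64, b: float64
```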

For this case, though, df.loc[[...]] alone is not enough, since it can't add new rows; the only option is to create the new row first and then assign to it with df.loc[[...]]:

>>> df = pd.DataFrame({'a': [1,2], 'b':[3,4]})
>>> df
   a  b
0  1  3
1  2  4
>>> df.loc[2] = [3, 4]
>>> df
   a  b
0  1  3
1  2  4
2  3  4
>>> df.loc[3] = 0
>>> df
   a  b
0  1  3
1  2  4
2  3  4
3  0  0
>>> df.loc[[3]] = [True, False]
>>> df
      a      b
0     1      3
1     2      4
2     3      4
3  True  False
>>> 
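
As an alternative to the approach above, one option is to cast the columns to object dtype before appending, in the same spirit as the astype(str) workaround from the question but without turning the existing integers into strings. This is a sketch under the assumption that object columns keep each appended value as-is; the behaviour of setting with enlargement has changed between pandas releases, so check the result on your version.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

# Cast to object first so each cell can hold an arbitrary Python value.
df = df.astype(object)

# Append the boolean row; with object columns the values are not coerced to int.
df.loc[2] = [True, False]

print(df.dtypes)
print(df)
```

The trade-off is that object columns lose the speed and memory benefits of a numeric dtype, so this is best reserved for frames that genuinely need mixed values.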
