I have a DataFrame below which has some missing values.
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[['A', 1, None], ['B', 2, 5]],
                  columns=['X', 'Y', 'Z'])
Since df['Z'] is supposed to be an integer column, I changed its data type to pandas' new experimental nullable integer type, as below.
ydf = df.copy()  # keep df unchanged so it retains the original dtype
ydf['Z'] = ydf['Z'].astype(pd.Int32Dtype())
ydf
X Y Z
0 A 1 <NA>
1 B 2 5
Now I am trying to use a simple np.where call to replace the non-null values in the column ydf['Z'] with a fixed integer value (say 1), using the code below.
np.where(pd.isna(ydf['Z']), pd.NA, np.where(ydf['Z'] > 0, 1, 0))
But I get the following error, and I am unable to understand why as I am already checking for the rows with null values in the first condition.
TypeError: boolean value of NA is ambiguous
np.where(ydf['Z'] > 0, 1, 0) is throwing the error. np.where expects an array of booleans only, but ydf['Z'] > 0 returns <NA> for the missing values. df['Z'] > 0 (where df is the original df, before converting it to the new Int32 type) returns False for NaN.
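One possible workaround, sketched here under the assumption that missing entries should stay <NA> in the result: fill the <NA> values in the comparison with False before handing it to np.where, then restore <NA> afterwards. The names positive, flags and result are only illustrative and do not come from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[['A', 1, None], ['B', 2, 5]],
                  columns=['X', 'Y', 'Z'])
ydf = df.copy()
ydf['Z'] = ydf['Z'].astype(pd.Int32Dtype())

# The comparison on a nullable column yields a nullable boolean Series
# containing <NA>; fill those with False so np.where gets plain booleans.
positive = (ydf['Z'] > 0).fillna(False).to_numpy(dtype=bool)

# 1 where Z > 0, otherwise 0, stored as a nullable integer Series.
flags = pd.Series(np.where(positive, 1, 0), index=ydf.index, dtype='Int32')

# Put <NA> back wherever the original value was missing.
result = flags.copy()
result[ydf['Z'].isna()] = pd.NA

print(result)
```

Whether the missing rows should end up as <NA> or as 0 depends on what you need; if 0 is acceptable, flags can be used directly and the last step can be dropped.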