I'm confused about the difference between Pandas Series objects when using reindex_like and related features. For example, consider the following Series objects:

>>> import numpy
>>> import pandas
>>> series = pandas.Series([1, 2, 3])
>>> x = pandas.Series([True]).reindex_like(series).fillna(True)
>>> y = pandas.Series(True, index=series.index)
>>> x
0    True
1    True
2    True
>>> y
0    True
1    True
2    True

On the surface x and y appear to be identical in their contents and indexing. However, they must be different in some way because one of them causes an error when using numpy.logical_and() and the other does not.

>>> numpy.logical_and(series, y)
0    True
1    True
2    True
>>> numpy.logical_and(series, x)
Traceback (most recent call last):
  File "<ipython-input-10-e2050a2015bf>", line 1, in <module>
    numpy.logical_and(series, x)
AttributeError: logical_and

What is numpy.logical_and() complaining about here? I don't see the difference between the two series, x and y, but there must be some subtle difference.

The Pandas documentation says the Series object is a valid argument to "most NumPy functions." That holds for y here but not for x, so apparently the creation mechanism makes x unusable with this particular NumPy function.

As a side note, which of the two creation mechanisms, reindex_like() or the index argument, is more efficient and idiomatic for this scenario? Maybe there is another, better way I haven't considered.

  • Which pandas/numpy/python version are you using? numpy.logical_and(series, x) works without error for me in pandas-0.9.0-py2.7... Commented Nov 29, 2012 at 16:03
  • Using pandas 0.9.1 and numpy 1.6.2 Commented Nov 29, 2012 at 16:11
  • I would be surprised if this is a new bug in pandas 0.9.1, but perhaps worth upgrading to numpy version 1.8? (the version I am using which seems to work...) Commented Nov 29, 2012 at 16:19
  • The current stable release of numpy is 1.6.2 (via PyPI and the numpy site). Are you using the github repo of numpy for version 1.8? I think I'll post a github issue for pandas and see what they say. Commented Nov 29, 2012 at 16:25
  • I posted an issue on github for Pandas. Maybe they will have more insight: github.com/pydata/pandas/issues/2388 Commented Nov 29, 2012 at 16:32

1 Answer

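It looks like this is not a bug; the subtle difference comes from using the reindex_like() method. Reindexing Series([True]) onto a three-element index leaves two positions with no value, and those are filled with NaN. A bool Series cannot hold NaN, so the dtype changes from bool to object. The later fillna(True) replaces the NaN values but, in the versions used here, leaves the dtype as object, which is why x prints the same as y yet behaves differently. With the NumPy version reported in the comments (1.6.2), applying numpy.logical_and() to an object array appears to look for a logical_and method on the elements themselves, which would explain the AttributeError: logical_and.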

>>> series = pandas.Series([1, 2, 3])
>>> x = pandas.Series([True])
>>> x.dtype
dtype('bool')
>>> x = pandas.Series([True]).reindex_like(series)
>>> x.dtype
dtype('object')

I posted an issue about this anomaly on the Pandas GitHub page; the full explanation and discussion is there (github.com/pydata/pandas/issues/2388). This behavior could change in the future, so watch the issue for ongoing details.
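To make the dtype difference visible and work around it, something like the following may help. This is only a sketch: the names reindexed, z, and result are mine, and the exact dtype that fillna() returns can differ between pandas versions, so the cast back to bool is done explicitly with astype(bool).

```python
import numpy
import pandas

series = pandas.Series([1, 2, 3])

# Reindexing introduces NaN for the two missing labels,
# so the bool Series is upcast to object dtype.
reindexed = pandas.Series([True]).reindex_like(series)
print(reindexed.dtype)  # object

# Fill the gaps, then cast back to bool explicitly rather than
# relying on fillna() to restore the dtype.
x = reindexed.fillna(True).astype(bool)
print(x.dtype)  # bool

# Alternative: supply the fill value while reindexing,
# so no NaN is introduced in the first place.
z = pandas.Series([True]).reindex(series.index, fill_value=True)

# The direct construction from the question; it never passes through NaN.
y = pandas.Series(True, index=series.index)

# With a real bool dtype, the NumPy call works.
result = numpy.logical_and(series, x)
print(result)
```

For simply creating an all-True Series aligned to an existing index, pandas.Series(True, index=series.index) seems like the more direct choice, since it avoids the intermediate NaN values altogether.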
