
Say I have an array x = np.arange(6).reshape(3, 2).

What is the meaning of x[False], or x[np.asanyarray(False)]? Both result in array([], shape=(0, 3, 2), dtype=int64), which is unexpected.

I expected to get an IndexError because of an improperly sized mask, as from something like x[np.ones((2, 2), dtype=np.bool)].

This behavior is consistent for x[True] and x[np.asanyarray(True)], as both result in an additional dimension: array([[[0, 1], [2, 3], [4, 5]]]).

I am using numpy 1.13.1. It appears that the behavior has changed recently, so while it is nice to have answers for older versions, please mention your version in the answers.

EDIT

Just for completeness, I filed https://github.com/numpy/numpy/issues/9515 based on the commentary on this question.

EDIT 2

And closed it almost immediately.

  • What NumPy version are you in? I get array([0, 1]) as a result. And this is because False is treated as 0, --> x[0] (in 1.11.3) Commented Aug 3, 2017 at 19:47
  • @BradSolomon It was changed in the last version: docs.scipy.org/doc/numpy-dev/… (Boolean indexing into scalar arrays return a new 1-d array. This means that array(1)[array(True)] gives array([1]) and not the original array.) Commented Aug 3, 2017 at 19:49
  • @BradSolomon. Version 1.13.1, False will be treated as an integer, unless you pass in a boolean matrix, as I have shown in my expected example. I am fine with the idea of x[False] == x[0], but not so much with x[np.array(False)] == x[0]. Neither seems to be happening. Commented Aug 3, 2017 at 19:50
  • @ayhan: No, that's a different change. Commented Aug 3, 2017 at 19:50
  • @ayhan: The relevant part here is a bit higher up than the part you quoted: "Boolean scalars (including python True) are legal boolean indexes and never treated as integers." Commented Aug 3, 2017 at 19:51

1 Answer


There's technically no requirement that the dimensionality of a mask match the dimensionality of the array you index with it. (In previous versions, there were even fewer restrictions, and you could get away with some extreme shape mismatches.)

The docs describe boolean indexing as

A single boolean index array is practically identical to x[obj.nonzero()] where, as described above, obj.nonzero() returns a tuple (of length obj.ndim) of integer index arrays showing the True elements of obj.

but nonzero is weird for 0-dimensional input, so this case is one of the ways that "practically identical" turns out to be not identical:

the nonzero equivalence for Boolean arrays does not hold for zero dimensional boolean arrays.
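For masks with at least one dimension, the quoted equivalence can be checked directly. The following is a minimal sketch, assuming a recent NumPy; the names elem_mask and row_mask are only illustrative. It also shows that a mask with fewer dimensions than the array is accepted, as long as its shape matches the leading axes.

```python
import numpy as np

x = np.arange(6).reshape(3, 2)

# A mask with the same shape as x selects individual elements.
elem_mask = x % 2 == 0
print(np.array_equal(x[elem_mask], x[elem_mask.nonzero()]))  # True

# A 1-D mask only has to match the first axis; it selects whole rows.
row_mask = np.array([True, False, True])
print(np.array_equal(x[row_mask], x[row_mask.nonzero()]))    # True
print(x[row_mask].shape)                                     # (2, 2)
```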

NumPy has a special case for a 0-dimensional boolean index, motivated by the desire to have the following behavior:

In [3]: numpy.array(3)[True]
Out[3]: array([3])

In [4]: numpy.array(3)[False]
Out[4]: array([], dtype=int64)

I'll refer to a comment in the source code that handles a 0-dimensional boolean index:

if (PyArray_NDIM(arr) == 0) {
    /*
     * This can actually be well defined. A new axis is added,
     * but at the same time no axis is "used". So if we have True,
     * we add a new axis (a bit like with np.newaxis). If it is
     * False, we add a new axis, but this axis has 0 entries.
     */

While this is primarily intended for a 0-dimensional index to a 0-dimensional array, it also applies to indexing multidimensional arrays with booleans. Thus,

x[True]

is equivalent to x[np.newaxis], producing a result with a new length-1 axis in front, and

x[False]

produces a result with a new axis in front of length 0, selecting no elements.
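To see this on the array from the question, here is a short sketch, assuming NumPy 1.13 or later (older versions treated the boolean as an integer index, per the comments on the question). The variable names are only for illustration.

```python
import numpy as np

x = np.arange(6).reshape(3, 2)

with_true = x[True]          # new leading axis of length 1
with_false = x[False]        # new leading axis of length 0
with_newaxis = x[np.newaxis]

print(with_true.shape)                           # (1, 3, 2)
print(with_false.shape)                          # (0, 3, 2)
print(np.array_equal(with_true, with_newaxis))   # True

# A 0-dimensional boolean array behaves the same as the Python scalar.
print(x[np.array(False)].shape)                  # (0, 3, 2)
```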


2 Comments

This is great. I will just check my masks using np.array(mask, dtype=np.bool, copy=False, subok=True, ndmin=1). That should remove the "weird" behavior.
Closed the issue.
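The check described in the comment above could look roughly like this. It is a sketch only: as_mask is a hypothetical helper name, it uses the built-in bool because the np.bool alias is not available in every NumPy release, and it leaves out copy=False and subok=True (copy=False raises in NumPy 2.0 and later when a copy cannot be avoided). On recent NumPy versions a scalar mask then fails with the IndexError the question expected, instead of adding an axis.

```python
import numpy as np

def as_mask(mask):
    """Hypothetical helper: coerce to a boolean array with at least 1 dimension."""
    return np.array(mask, dtype=bool, ndmin=1)

x = np.arange(6).reshape(3, 2)

# A properly sized mask still works as usual and selects rows 0 and 2.
selected = x[as_mask([True, False, True])]
print(selected.shape)  # (2, 2)

# A scalar becomes a shape-(1,) mask, which no longer matches axis 0 of x.
try:
    x[as_mask(False)]
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```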
