Efficient identification of adjacent elements in numpy matrix

Question

I have a 100 by 100 numpy matrix. The matrix is mostly filled with zeros, but also contains some number of ints. For example:

[0 0 0 0 0 0 0 1]
[0 2 2 0 0 0 0 0]
[0 0 2 0 0 0 0 0]  False
[0 0 0 0 0 0 0 0]
[0 3 3 0 0 0 0 0]

What is the most efficient way to identify whether the matrix contains any number of adjacent ints of distinct types?

The above example would return False. Here is a True example, with the row containing the adjacency indicated:

[0 0 0 0 0 0 0 1]
[0 2 2 1 1 0 0 0]   <----  True
[0 0 2 0 0 0 0 0]  
[0 0 0 0 0 0 0 0]
[0 3 3 0 0 0 0 0]

Diagonals do not count as being adjacent. So this example would also return False:

[0 0 0 1 1 1 1 1]
[0 2 2 0 1 0 0 0]
[0 0 2 0 0 0 0 0]   False
[0 3 0 0 0 0 0 0]
[3 3 3 0 0 0 0 0]

I do not need to identify the position of the adjacency, just whether it exists of not.

Currently, I cannot do better than finding each no-zero element in the matrix and then checking its 4 flanking elements.

Thanks for all the great answers.

Without posting a full answer, numpy.diff with an axis parameter is where I would start. — Benjamin
– Benjamin, Commented Dec 19, 2016 at 2:26
Could you add the code that does check each element. That could remove some lingering ambiguity about the test. — hpaulj
– hpaulj, Commented Dec 19, 2016 at 4:37

MSeifert · Accepted Answer · 2016-12-19 05:35:28Z

If you can use scipy this will be quite easy using ndimage.label and ndimage.labeled_comprehension:

import numpy as np
from scipy import ndimage

def multiple_unique_item(array):
    return len(np.unique(array)) > 1

def adjacent_diff(array):
    labeled_array, num_labels = ndimage.label(array)
    labels = np.arange(1, num_labels+1)
    any_multiple = ndimage.labeled_comprehension(array, labeled_array, labels, 
                                                 multiple_unique_item, bool, 0)
    return any_multiple.any()

label defaults to label neighboring values different from 0 without diagonals. Then the comprehension passes all values associated with a label to the helper function - which checks if more than one unique value is present. Finally it checks if any label had more than one value and returns this.

To test this on your test input arrays:

arr1 = np.array([[0,0,0,0,0,0,0,1],
                 [0,2,2,1,1,0,0,0],  
                 [0,0,2,0,0,0,0,0],
                 [0,0,0,0,0,0,0,0],
                 [0,3,3,0,0,0,0,0]])

arr2 = np.array([[0,0,0,1,1,1,1,1],
                 [0,2,2,0,1,0,0,0],
                 [0,0,2,0,0,0,0,0],  
                 [0,3,0,0,0,0,0,0],
                 [3,3,3,0,0,0,0,0]])

arr3 = np.array([[0,0,0,0,0,0,0,1],
                 [0,2,2,0,0,0,0,0],
                 [0,0,2,0,0,0,0,0],  
                 [0,0,0,0,0,0,0,0],
                 [0,3,3,0,0,0,0,0]])

>>> adjacent_diff(arr1)
True
>>> adjacent_diff(arr2)
False
>>> adjacent_diff(arr3)
False

Andras Deak -- Слава Україні · Accepted Answer · 2016-12-19 03:08:34Z

Looking at the description of your problem, it's probably not too much computational effort to check each possible nonzero integer value for its position in the array, and see if there are intersections. Now, this is generally overkill, but on your scale it might work: you can get the indices of each integer collection, and compute their distances using scipy.spatial.distance.cdist. I'm sure that some smart solution based on diff or something else is more efficient, but I had fun anyway:

import numpy as np
from scipy.spatial.distance import cdist
from itertools import combinations

M1 = np.array(
[[0,0,0,0,0,0,0,1],
 [0,2,2,1,1,0,0,0],  
 [0,0,2,0,0,0,0,0],
 [0,0,0,0,0,0,0,0],
 [0,3,3,0,0,0,0,0]])

M2 = np.array(
[[0,0,0,1,1,1,1,1],
 [0,2,2,0,1,0,0,0],
 [0,0,2,0,0,0,0,0],  
 [0,3,0,0,0,0,0,0],
 [3,3,3,0,0,0,0,0]])

def overlaps_eh(M):
    uniques = np.delete(np.unique(M),0) # get integers present
    unival_inds = [np.transpose(np.where(M==unival)) for unival in uniques]
    # unival_inds[k] contains the i,j indices of each element with the kth unique value

    for i1,i2 in combinations(range(len(unival_inds)),2):
        if np.any(cdist(unival_inds[i1],unival_inds[i2],'cityblock')==1):
            return True
    # if we're here: no adjacencies
    return False

First we collect the nonzero unique matrix elements into the array uniques. Then for each unique value we find the i,j indices of each element in the input array that has this value. Then we check each pair of unique values (generated using itertools.combinations), and use scipy.spatial.distance.cdist to measure the pair-wise distance of each pair of matrix elements. Using the Manhattan distance, if any pair of elements has distance 1, they are adjacent. So we only have to return True in case any of these distances is 1, otherwise we return False.

Divakar · Accepted Answer · 2016-12-19 05:38:19Z

3

Here's an approach making heavy usage of slicing that are just views with focus on performance -

def distinct_ints(a):
    # Mask of zeros, non-zeros as we would use them frequently
    zm = a==0
    nzm = ~zm

    # Look for distint ints across rows
    row_thresh = (nzm[:,1:] & zm[:,:-1]).sum(1)
    row_out = ((nzm[:,1:] & (a[:,1:] != a[:,:-1])).sum(1)>row_thresh).any()

    # Look for distint ints across cols
    col_thresh = (nzm[1:] & zm[:-1]).sum(0)
    col_out = ((nzm[1:] & (a[1:] != a[:-1])).sum(0)>col_thresh).any()

    # Any from rows or cols
    out = row_out | col_out
    return out

edited Dec 19, 2016 at 5:38

answered Dec 19, 2016 at 5:22

Divakar

222k19 gold badges274 silver badges374 bronze badges

Comments

user6655984 · Accepted Answer · 2016-12-19 03:22:19Z

2

Here's a solution using a masked array:

import numpy as np
import numpy.ma as ma
a = np.array([[0,1,0], [0,1,0], [2,2,2]])    # sample data 
x = ma.masked_equal(a, 0)                    # mask zeros
adjacencies = np.count_nonzero(np.diff(x, axis=0).filled(0)) + np.count_nonzero(np.diff(x, axis=1).filled(0))

On the last line, diff is applied to the masked array (zero entries ignored); the nonzero entries in diff mean adjacent different nonzero entries in array a. The variable adjacencies will have the total number of adjacencies (perhaps you only want to know if it's 0 or not). In the above example, it is 1.

answered Dec 19, 2016 at 3:22

user6655984

Comments

Benjamin · Accepted Answer · 2016-12-19 15:05:55Z

1

It is possible to do it with numpy.diff, however, the fact that the zeros are not supposed to be considered complicates things a bit.

You can set zeros to a value large or small enough not to cause issues:

a[a == 0] = -999

Or use a float array and set them to nan or inf:

a[a == 0] = numpy.nan

Then simply look for first order differences of 1 in each direction:

numpy.any(numpy.abs(numpy.diff(a, axis=0)) == 1) or numpy.any(numpy.abs(numpy.diff(a, axis=1)) == 1)

edited Dec 19, 2016 at 15:05

answered Dec 19, 2016 at 14:59

Benjamin

12k13 gold badges75 silver badges121 bronze badges

Collectives™ on Stack Overflow

Efficient identification of adjacent elements in numpy matrix

5 Answers 5

Comments

Comments

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

5 Answers 5

Comments

Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related