Intersection of 2D and 1D NumPy array

Question

For every element in array A[:,3:] that is also in array B, I want to set the value to 0, producing the array result shown below:

import numpy as np

A = np.array([[1, 1, 10, 101, 102, 103,   0,   0],
              [2, 2, 10, 102, 108,   0,   0,   0],
              [3, 3, 11, 101, 102, 106, 107, 108]])

B = np.array([101, 106, 108])

result = np.array([[1, 1, 10,   0, 102, 103,   0,   0],
                   [2, 2, 10, 102,   0,   0,   0,   0],
                   [3, 3, 11,   0, 102,   0, 107,   0]])

I know there is a way to do this using in1d and broadcasting A as a 1D array, but I have no idea how to go about this.

Any help would be greatly appreciated.

Answer 1:

If you pass the sliced 2D array A[:,3:] to np.in1d, it flattens the input and tests each element for membership in B, returning a 1D boolean mask. That mask can be reshaped to the shape of the slice and used for boolean indexing into the slice to set the True elements to zero. A one-liner implementation would look something like this -

A[:,3:][np.in1d(A[:,3:],B).reshape(A.shape[0],-1)] = 0

Sample run -

In [37]: A
Out[37]: 
array([[  1,   1,  10, 101, 102, 103,   0,   0],
       [  2,   2,  10, 102, 108,   0,   0,   0],
       [  3,   3,  11, 101, 102, 106, 107, 108]])

In [38]: np.in1d(A[:,3:],B) # Flattened mask
Out[38]: 
array([ True, False, False, False, False, False,  True, False, False,
       False,  True, False,  True, False,  True], dtype=bool)

In [39]: np.in1d(A[:,3:],B).reshape(A.shape[0],-1) # Reshaped mask
Out[39]: 
array([[ True, False, False, False, False],
       [False,  True, False, False, False],
       [ True, False,  True, False,  True]], dtype=bool)

In [40]: A[:,3:][np.in1d(A[:,3:],B).reshape(A.shape[0],-1)] = 0 # Final code

In [41]: A
Out[41]: 
array([[  1,   1,  10,   0, 102, 103,   0,   0],
       [  2,   2,  10, 102,   0,   0,   0,   0],
       [  3,   3,  11,   0, 102,   0, 107,   0]])
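On NumPy 1.13 and later, np.isin is an alternative worth considering: it returns a mask with the same shape as its first argument, so the reshape step is not needed (np.in1d is also deprecated as of NumPy 2.0). A minimal sketch using the arrays from the question; the names sub and mask are only illustrative:

```python
import numpy as np

A = np.array([[1, 1, 10, 101, 102, 103,   0,   0],
              [2, 2, 10, 102, 108,   0,   0,   0],
              [3, 3, 11, 101, 102, 106, 107, 108]])
B = np.array([101, 106, 108])

sub = A[:, 3:]          # basic slicing returns a view into A
mask = np.isin(sub, B)  # boolean mask with the same shape as sub
sub[mask] = 0           # assignment through the view modifies A in place
```

Because sub is a view, A itself ends up equal to the result array from the question.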

To make things simpler, you could index a flattened view of the array directly with the 1D mask from np.in1d, which avoids the reshape. For a solution that changes only the sliced A[:,3:], you can use .flat and then index like so -

A[:,3:].flat[np.in1d(A[:,3:],B)] = 0

For a case when you would like to set matching ones across entire A, you can use .ravel() -

A.ravel()[np.in1d(A,B)] = 0

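.ravel() returns a view whenever it can, which is the case for a contiguous array such as A, and from the docs, .flat is an iterator over the array that doesn't create a copy either, so these should be cheap. Note that on the non-contiguous slice A[:,3:], .ravel() would return a copy, which is why .flat is used there.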
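The view-versus-copy behaviour can be checked with np.shares_memory. The sketch below uses a made-up contiguous array rather than the data from the question, and all variable names are illustrative:

```python
import numpy as np

A = np.arange(24).reshape(3, 8)   # C-contiguous example array
sub = A[:, 3:]                    # non-contiguous view into A

# ravel() of a contiguous array is a view; ravel() of this slice is a copy
whole_is_view = np.shares_memory(A, A.ravel())
slice_is_view = np.shares_memory(A, sub.ravel())

sub.ravel()[0] = -1               # writes into a temporary copy only
unchanged = A[0, 3]               # A is not affected

sub.flat[0] = -1                  # .flat assigns into the underlying data
changed = A[0, 3]                 # A now reflects the write
```

Here whole_is_view is True and slice_is_view is False, so assigning through sub.ravel() would silently leave A untouched, while sub.flat writes through.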

Answer 2:

Here's a way to do this without using in1d(). You can use the regular Python in operator with a ravel-ed version of your array:

listed = [aa  in B for aa in A[:, 3:].ravel()]

# mask for unaffected left columns of A
mask1 = np.array([False]*A.shape[0]*3)
mask1.shape = (A.shape[0], 3)

# mask for affected right columns of A
mask2 = np.array(listed)
mask2.shape = (A.shape[0], A.shape[1]-3)

# join masks together so you have a mask with same dimensions as A
mask = np.hstack((mask1, mask2))

result  = A.copy()
result[mask] = 0

Or more succinctly:

listed = [aa in B for aa in A[:, 3:].ravel()]
listed_array = np.array(listed)
# reshape the array (not the list) to match the sliced columns
listed_array.shape = (A.shape[0], A.shape[1]-3)
A[:, 3:][listed_array] = 0

You're probably better off with in1d() but it's nice to know there are other options.
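One more option, not taken from the answers above, is plain broadcasting: compare every element of the slice with every element of B and reduce over the last axis. This is a sketch under the assumption that B is small, since the intermediate boolean array has sub.size * B.size elements:

```python
import numpy as np

A = np.array([[1, 1, 10, 101, 102, 103,   0,   0],
              [2, 2, 10, 102, 108,   0,   0,   0],
              [3, 3, 11, 101, 102, 106, 107, 108]])
B = np.array([101, 106, 108])

sub = A[:, 3:]                       # view of the columns to modify
# sub[:, :, None] has shape (3, 5, 1); comparing with B broadcasts to (3, 5, 3)
matches = sub[:, :, None] == B
mask = matches.any(axis=-1)          # True where an element equals any value in B
sub[mask] = 0                        # modifies A in place through the view
```

For large B, in1d() or np.isin is likely the better choice because it avoids building the full comparison array.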

Source: https://stackoverflow.com/questions/32481491/intersection-of-2d-and-1d-numpy-array

Tags

python

arrays

numpy

intersection