Find and delete rows from a multi-dimensional numpy array

Anonymous (unverified), submitted on 2019-12-03 02:33:02

Question:

I have two numpy-arrays:

    import numpy as np

    p_a_colors = np.array([[0, 0, 0],
                           [0, 2, 0],
                           [119, 103, 82],
                           [122, 122, 122],
                           [122, 122, 122],
                           [3, 2, 4]])

    p_rem = np.array([[119, 103, 82],
                      [122, 122, 122]])

I want to delete all the rows from p_a_colors that are in p_rem, so I get:

    p_r_colors = np.array([[0, 0, 0],
                           [0, 2, 0],
                           [3, 2, 4]])

I think something like this should work:

    p_r_colors = np.delete(p_a_colors, np.where(np.all(p_a_colors == p_rem, axis=0)), 0)

but I just don't get the axis or [:] right.

I know that

    import copy

    p_r_colors = copy.deepcopy(p_a_colors)
    for i in range(len(p_rem)):
        p_r_colors = np.delete(p_r_colors, np.where(np.all(p_r_colors == p_rem[i], axis=-1)), 0)

would work, but I am trying to avoid (Python) loops, because I also care about performance.

Answer 1:

This is how I would do it:

    dtype = np.dtype((np.void, (p_a_colors.shape[1] *
                                p_a_colors.dtype.itemsize)))
    mask = np.in1d(p_a_colors.view(dtype), p_rem.view(dtype))
    p_r_colors = p_a_colors[~mask]

    >>> p_r_colors
    array([[0, 0, 0],
           [0, 2, 0],
           [3, 2, 4]])

You need the void dtype view so that numpy treats each row as a single item and compares rows as a whole. After that, using the built-in set routines seems like the obvious way to go.
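If building the void dtype by hand feels fragile, the same idea (treat each row as one item, then use a set routine) can be sketched with np.unique(..., axis=0). This is a swapped-in variant rather than the method above, and the helper name remove_rows is made up for illustration:

```python
import numpy as np

p_a_colors = np.array([[0, 0, 0],
                       [0, 2, 0],
                       [119, 103, 82],
                       [122, 122, 122],
                       [122, 122, 122],
                       [3, 2, 4]])
p_rem = np.array([[119, 103, 82],
                  [122, 122, 122]])


def remove_rows(arr, rows_to_remove):
    """Return the rows of arr that do not occur in rows_to_remove.

    Hypothetical helper: both inputs are assumed to be 2-D arrays with
    the same number of columns and a common dtype.
    """
    n = len(rows_to_remove)
    # Stack both arrays so every distinct row gets one shared integer id.
    stacked = np.vstack([rows_to_remove, arr])
    _, row_ids = np.unique(stacked, axis=0, return_inverse=True)
    row_ids = row_ids.ravel()  # make sure the ids are 1-D
    # A row of arr is dropped if its id also appears among the first n ids.
    mask = np.isin(row_ids[n:], row_ids[:n])
    return arr[~mask]


p_r_colors = remove_rows(p_a_colors, p_rem)
print(p_r_colors)
```

Because the ids stand for whole rows, a row that only shares some values with p_rem is kept. The np.unique call sorts the stacked rows, so this is probably not faster than the void view; it only avoids the manual dtype.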



Answer 2:

It's ugly, but this works:

    >>> from functools import reduce  # only needed on Python 3
    >>> tmp = reduce(lambda x, y: x | np.all(p_a_colors == y, axis=-1),
    ...              p_rem, np.zeros(p_a_colors.shape[:1], dtype=bool))
    >>> indices = np.where(tmp)[0]
    >>> np.delete(p_a_colors, indices, axis=0)
    array([[0, 0, 0],
           [0, 2, 0],
           [3, 2, 4]])


Answer 3:

You are getting the indices wrong. The expression p_a_colors==p_rem does not compare rows, because the two arrays have different shapes, (6, 3) and (2, 3), which cannot be broadcast against each other; depending on the numpy version it evaluates to a single False or raises an error. If you want to use np.delete, you need a correct list of indices.

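On the other hand, this is easier with a boolean index. A plain `p_a_colors[i] not in p_rem` would already count a single shared element as a match, so each row is compared as a whole here:

    >>> idx = np.array([not (p_rem == row).all(axis=1).any()
    ...                 for row in p_a_colors])
    >>> p_a_colors[idx]
    array([[0, 0, 0],
           [0, 2, 0],
           [3, 2, 4]])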

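Or, inspired by the suggestion of @Jaime, you can also create the index with np.in1d, here in one line. Note that this tests each element separately, so it would also drop a row whose values merely all occur somewhere in p_rem:

    >>> idx = ~np.all(np.in1d(p_a_colors, p_rem).reshape(p_a_colors.shape),
    ...               axis=1)
    >>> p_a_colors[idx]
    array([[0, 0, 0],
           [0, 2, 0],
           [3, 2, 4]])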

If you must use np.delete, just convert the boolean mask into an array of integer indices:

    >>> idx = np.array([(p_rem == row).all(axis=1).any()
    ...                 for row in p_a_colors])
    >>> idx = np.arange(p_a_colors.shape[0])[idx]
    >>> np.delete(p_a_colors, idx, axis=0)
    array([[0, 0, 0],
           [0, 2, 0],
           [3, 2, 4]])
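The shape mismatch described in this answer can also be worked around without any Python loop by inserting extra axes, so that every row of p_a_colors is broadcast against every row of p_rem. A minimal sketch, assuming both arrays are small enough that the intermediate (6, 2, 3) boolean array fits comfortably in memory:

```python
import numpy as np

p_a_colors = np.array([[0, 0, 0],
                       [0, 2, 0],
                       [119, 103, 82],
                       [122, 122, 122],
                       [122, 122, 122],
                       [3, 2, 4]])
p_rem = np.array([[119, 103, 82],
                  [122, 122, 122]])

# (6, 1, 3) == (1, 2, 3) broadcasts to (6, 2, 3): every row against every row.
elementwise = p_a_colors[:, None, :] == p_rem[None, :, :]

# A pair of rows matches only if all three components are equal.
matches = elementwise.all(axis=-1)   # shape (6, 2)

# A row is removed if it matches at least one row of p_rem.
mask = matches.any(axis=1)           # shape (6,)

p_r_colors = p_a_colors[~mask]
print(p_r_colors)
```

The intermediate array grows with len(p_a_colors) * len(p_rem) * 3, so for large inputs a row-id or void-view approach is likely the better fit.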

