How to select all locations of unique elements in numpy 2d array with bounding box around them?

Anonymous (unverified), posted 2019-12-03 03:03:02

Question:

I have a 2D numpy array and want to find every location of each unique element. The unique elements themselves are easy to get with numpy.unique(numpyarray). The tricky part is finding all the locations of each unique element. Consider the following example.

    array([[1, 1, 2, 2],
           [1, 1, 2, 2],
           [3, 3, 4, 4],
           [3, 3, 4, 4]])

The result should be

    1, (0,0),(1,1)
    2, (0,2),(1,3)
    3, (2,0),(3,1)
    4, (2,2),(3,3)

How can I do this, and what would be a suitable way to store and iterate over the values?

Note that all occurrences of a given unique value are adjacent to each other; the only gaps between them are zeros. Consider another variant:

    array([[1, 0, 1, 2, 2],
           [1, 0, 1, 2, 2],
           [3, 0, 3, 4, 4],
           [3, 0, 3, 4, 4]])

The result should be

    1, (0,0),(1,2)
    2, (0,3),(1,4)
    3, (2,0),(3,2)
    4, (2,3),(3,4)

The zeros on the boundaries are to be ignored.

thanks a lot

Answer 1:

The simple, brute force way to do it is to just use numpy.where.

For example, if you just want the bounding box:

    import numpy as np

    x = np.array([[1,1,2,2],
                  [1,1,2,2],
                  [3,3,4,4],
                  [3,3,4,4]])

    for val in np.unique(x):
        # Row and column indices of every cell equal to this value
        rows, cols = np.where(x == val)
        rowstart, rowstop = np.min(rows), np.max(rows)
        colstart, colstop = np.min(cols), np.max(cols)
        print(val, (rowstart, colstart), (rowstop, colstop))

This will work for the example with zeros as well, although 0 is then reported as a value of its own unless you skip it.
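On the question of how to store and iterate over the results, one option is a plain dict keyed by value. The following is a minimal sketch of that idea, not the only reasonable layout; the function name `bounding_boxes` and the `ignored_val` parameter are made up for illustration.

```python
import numpy as np

def bounding_boxes(arr, ignored_val=0):
    """Map each unique value to ((row_min, col_min), (row_max, col_max)).

    Hypothetical helper: the name and the ignored_val parameter are
    illustrative, not part of numpy.
    """
    boxes = {}
    for val in np.unique(arr):
        if val == ignored_val:
            continue  # zeros are gaps between objects, not objects
        rows, cols = np.where(arr == val)
        # Convert numpy scalars to plain Python numbers for tidy keys and output
        boxes[val.item()] = (
            (int(rows.min()), int(cols.min())),
            (int(rows.max()), int(cols.max())),
        )
    return boxes

x = np.array([[1, 0, 1, 2, 2],
              [1, 0, 1, 2, 2],
              [3, 0, 3, 4, 4],
              [3, 0, 3, 4, 4]])

boxes = bounding_boxes(x)
for val, (top_left, bottom_right) in boxes.items():
    print(val, top_left, bottom_right)
```

Both corners are inclusive here, matching the format asked for in the question.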

If the array is large, and you already have scipy around, you might consider using scipy.ndimage.find_objects instead, as @unutbu suggested.

In the particular case of your example, where your unique values are sequential integers, you can use find_objects directly. It expects an array where each sequential integer other than 0 represents an object that it needs to return the bounding box of. (0 is ignored, exactly as you want.) However, in general, you'll need to do a touch of pre-processing to convert arbitrary unique values to sequential integers.

find_objects returns a list of tuples of slice objects. Honestly, these are probably exactly what you want if you're after the bounding box. However, printing the starting and stopping indices will look a bit messier.

    import numpy as np
    import scipy.ndimage as ndimage

    x = np.array([[1, 0, 1, 2, 2],
                  [1, 0, 1, 2, 2],
                  [3, 0, 3, 4, 4],
                  [3, 0, 3, 4, 4]])

    # Entry i-1 of the returned list is the bounding box of label i
    for i, item in enumerate(ndimage.find_objects(x), start=1):
        print(i, item)

This will look slightly different than you might expect. These are slice objects, so the "max" value will always be one higher than the "max" in the previous example. This is so that you can simply slice with the given tuple to get the data in question.

E.g.

    for i, item in enumerate(ndimage.find_objects(x), start=1):
        print(i, ':')
        # Indexing with the tuple of slices extracts the sub-array
        print(x[item], '\n')

If you really want the starts and stops, just do something like this:

    for i, (rowslice, colslice) in enumerate(ndimage.find_objects(x), start=1):
        # Subtract 1 from each stop to get the inclusive maximum index
        print(i,
              (rowslice.start, rowslice.stop - 1),
              (colslice.start, colslice.stop - 1))
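One detail worth guarding against: when a label number is missing from the array, find_objects returns None in that position instead of a tuple of slices, so unpacking it directly would fail. The sketch below skips those entries and stores inclusive corners in a dict; the helper name `inclusive_corners` is hypothetical, and the example array deliberately has no label 3.

```python
import numpy as np
from scipy import ndimage

def inclusive_corners(labels):
    """Map each label to ((row_min, col_min), (row_max, col_max)).

    Hypothetical helper built on scipy.ndimage.find_objects.
    """
    corners = {}
    for label, slices in enumerate(ndimage.find_objects(labels), start=1):
        if slices is None:
            continue  # this label number does not occur in the array
        rowslice, colslice = slices
        # Slice stops are exclusive, so subtract 1 for the inclusive corner
        corners[label] = ((rowslice.start, colslice.start),
                          (rowslice.stop - 1, colslice.stop - 1))
    return corners

# Labels 1, 2, 4 and 5 are present; 3 is intentionally missing
x = np.array([[1, 0, 1, 2, 2],
              [1, 0, 1, 2, 2],
              [4, 0, 4, 5, 5],
              [4, 0, 4, 5, 5]])

corners = inclusive_corners(x)
for label, (top_left, bottom_right) in corners.items():
    print(label, top_left, bottom_right)
```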

If your unique values are not sequential integers, you'll need to do a bit of pre-processing, as I mentioned before. You might do something like this:

    import numpy as np
    import scipy.ndimage as ndimage

    x = np.array([[1.1, 0.0, 1.1, 0.9, 0.9],
                  [1.1, 0.0, 1.1, 0.9, 0.9],
                  [3.3, 0.0, 3.3, 4.4, 4.4],
                  [3.3, 0.0, 3.3, 4.4, 4.4]])
    ignored_val = 0.0
    labels = np.zeros(x.shape, dtype=int)

    # Assign sequential integer labels 1, 2, 3, ... to the unique values
    i = 1
    for val in np.unique(x):
        if val != ignored_val:
            labels[x == val] = i
            i += 1

    # Now we can use the "labels" array as input to find_objects
    for i, item in enumerate(ndimage.find_objects(labels), start=1):
        print(i, ':')
        print(x[item], '\n')
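The labelling loop above can also be written without a Python-level loop, which may help on large arrays. This is a sketch of one way to do it with np.unique and np.searchsorted, assuming a single ignored value; the function name `label_unique_values` is made up for illustration.

```python
import numpy as np
from scipy import ndimage

def label_unique_values(x, ignored_val=0.0):
    """Return (values, labels), where labels[i, j] == k + 1 wherever
    x[i, j] == values[k], and 0 wherever x equals ignored_val.

    Hypothetical helper; not part of numpy or scipy.
    """
    mask = x != ignored_val
    values = np.unique(x[mask])  # sorted unique values, ignored one excluded
    labels = np.zeros(x.shape, dtype=int)
    # Every x[mask] entry occurs in the sorted `values`, so searchsorted
    # returns its exact position there
    labels[mask] = np.searchsorted(values, x[mask]) + 1
    return values, labels

x = np.array([[1.1, 0.0, 1.1, 0.9, 0.9],
              [1.1, 0.0, 1.1, 0.9, 0.9],
              [3.3, 0.0, 3.3, 4.4, 4.4],
              [3.3, 0.0, 3.3, 4.4, 4.4]])

values, labels = label_unique_values(x)
for val, item in zip(values, ndimage.find_objects(labels)):
    print(val, ':')
    print(x[item], '\n')
```

Returning `values` alongside `labels` keeps the mapping from label number back to the original value, which the loop version discards.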

