How to crop a numpy 2d array to non-zero values?

孤独总比滥情好 2020-12-11 18:48

Let's say I have a 2D boolean NumPy array like this:

import numpy as np
a = np.array([
    [0,0,0,0,0,0],
    [0,1,0,1,0,0],
    [0,1,1,0,0,0],
    [0,0,0,0         


        
3 Answers
  • 2020-12-11 19:16

    Here's one with slicing and argmax to get the bounds -

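    def smallestbox(a):
        r = a.any(1)  # True for each row containing a non-zero value
        if r.any():
            m,n = a.shape
            c = a.any(0)  # True for each column containing a non-zero value
            # argmax gives the first True; on the reversed mask it locates the last True
            out = a[r.argmax():m-r[::-1].argmax(), c.argmax():n-c[::-1].argmax()]
        else:
            # No non-zero values at all: return an empty 2D array
            out = np.empty((0,0),dtype=bool)
        return out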
    

    Sample runs -

    In [142]: a
    Out[142]: 
    array([[False, False, False, False, False, False],
           [False,  True, False,  True, False, False],
           [False,  True,  True, False, False, False],
           [False, False, False, False, False, False]])
    
    In [143]: smallestbox(a)
    Out[143]: 
    array([[ True, False,  True],
           [ True,  True, False]])
    
    In [144]: a[:] = 0
    
    In [145]: smallestbox(a)
    Out[145]: array([], shape=(0, 0), dtype=bool)
    
    In [146]: a[2,2] = 1
    
    In [147]: smallestbox(a)
    Out[147]: array([[ True]])
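
To see why the slice bounds in `smallestbox` work, it can help to look at a single axis. The sketch below uses a made-up 1D mask `r` (hypothetical data, not taken from the answer). On a boolean array, `argmax` returns the index of the first `True`; applied to the reversed array, it counts the trailing `False` entries, so subtracting that count from the length gives the exclusive end of the slice.

```python
import numpy as np

# Hypothetical row mask: True where a row holds at least one non-zero value.
r = np.array([False, True, True, False, False])
m = r.size

start = r.argmax()           # index of the first True
stop = m - r[::-1].argmax()  # one past the index of the last True

print(start, stop)    # 1 3
print(r[start:stop])  # [ True  True]
```

Note that `argmax` also returns 0 when the mask contains no `True` at all, which is presumably why the function checks `r.any()` before slicing.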
    

    Benchmarking

    Other approach(es) -

    def argwhere_app(a): # @Jörn Hees's soln
        coords = np.argwhere(a)
        x_min, y_min = coords.min(axis=0)
        x_max, y_max = coords.max(axis=0)
        return a[x_min:x_max+1, y_min:y_max+1]
    

    Timings for varying degrees of sparsity (approx. 10%, 50% & 90%) -

    In [370]: np.random.seed(0)
         ...: a = np.random.rand(5000,5000)>0.1
    
    In [371]: %timeit argwhere_app(a)
         ...: %timeit smallestbox(a)
    1 loop, best of 3: 310 ms per loop
    100 loops, best of 3: 3.19 ms per loop
    
    In [372]: np.random.seed(0)
         ...: a = np.random.rand(5000,5000)>0.5
    
    In [373]: %timeit argwhere_app(a)
         ...: %timeit smallestbox(a)
    1 loop, best of 3: 324 ms per loop
    100 loops, best of 3: 3.21 ms per loop
    
    In [374]: np.random.seed(0)
         ...: a = np.random.rand(5000,5000)>0.9
    
    In [375]: %timeit argwhere_app(a)
         ...: %timeit smallestbox(a)
    10 loops, best of 3: 106 ms per loop
    100 loops, best of 3: 3.19 ms per loop
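
For readers without IPython, a comparison along the same lines could be run with the standard `timeit` module. This is only a sketch: the array size, the zeroed border, and the repeat count are assumptions chosen to keep it quick, and absolute timings will differ by machine. It also checks that both functions return the same crop.

```python
import timeit
import numpy as np

def smallestbox(a):
    r = a.any(1)
    if r.any():
        m, n = a.shape
        c = a.any(0)
        out = a[r.argmax():m - r[::-1].argmax(), c.argmax():n - c[::-1].argmax()]
    else:
        out = np.empty((0, 0), dtype=bool)
    return out

def argwhere_app(a):
    coords = np.argwhere(a)
    x_min, y_min = coords.min(axis=0)
    x_max, y_max = coords.max(axis=0)
    return a[x_min:x_max + 1, y_min:y_max + 1]

# Assumed test data: a smaller array than in the timings above, with a zero border.
rng = np.random.default_rng(0)
a = rng.random((500, 500)) > 0.9
a[:20, :] = False
a[:, :10] = False

# Both approaches should agree on the cropped result.
same = np.array_equal(smallestbox(a), argwhere_app(a))

t_box = timeit.timeit(lambda: smallestbox(a), number=20)
t_arg = timeit.timeit(lambda: argwhere_app(a), number=20)
print(same)
print(f"smallestbox: {t_box:.4f} s, argwhere_app: {t_arg:.4f} s")
```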
    
  • 2020-12-11 19:19

    After some more fiddling with this, I actually found a solution myself:

    coords = np.argwhere(a)
    x_min, y_min = coords.min(axis=0)
    x_max, y_max = coords.max(axis=0)
    b = cropped = a[x_min:x_max+1, y_min:y_max+1]
    

    The above works for boolean arrays out of the box. In case you have other conditions like a threshold t and want to crop to values larger than t, simply modify the first line:

    coords = np.argwhere(a > t)
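
One case the snippet above does not cover is an array where nothing matches: `coords` is then empty and `coords.min(axis=0)` raises a `ValueError`. A minimal sketch of a guarded version follows; the function name `crop_above` and the sample data are hypothetical, and returning an empty `(0, 0)` slice is just one possible convention.

```python
import numpy as np

def crop_above(a, t=0):
    """Crop `a` to the bounding box of entries greater than `t` (hypothetical helper)."""
    coords = np.argwhere(a > t)
    if coords.size == 0:
        # Nothing exceeds the threshold: return an empty 2D slice of `a`.
        return a[:0, :0]
    x_min, y_min = coords.min(axis=0)
    x_max, y_max = coords.max(axis=0)
    return a[x_min:x_max + 1, y_min:y_max + 1]

a = np.array([
    [0, 0, 0, 0],
    [0, 2, 5, 0],
    [0, 7, 1, 0],
    [0, 0, 0, 0],
])

print(crop_above(a, t=1))        # bounding box of the values 2, 5 and 7
print(crop_above(a, t=9).shape)  # (0, 0)
```

The bounding box can still contain values at or below `t` (here the `1` in the corner), since the crop is rectangular.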
    
  • 2020-12-11 19:29
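    # Drop all-zero rows, then transpose so the columns become rows
    a = np.transpose(a[np.sum(a,1) != 0])
    # Repeat to drop all-zero columns and transpose back
    a = np.transpose(a[np.sum(a,1) != 0])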
    

    It's not the quickest but it's alright.
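
It may be worth checking whether this matches what you need: filtering on row and column sums removes every all-zero row and column, including ones in the interior, whereas a bounding-box crop keeps them. The sketch below compares the two on a made-up array (hypothetical data).

```python
import numpy as np

# Hypothetical input with an all-zero row and column in the interior.
a = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
])

# Sum-based filtering, as in the answer above.
f = np.transpose(a[np.sum(a, 1) != 0])
f = np.transpose(f[np.sum(f, 1) != 0])

# Bounding-box crop based on np.argwhere.
coords = np.argwhere(a)
x_min, y_min = coords.min(axis=0)
x_max, y_max = coords.max(axis=0)
box = a[x_min:x_max + 1, y_min:y_max + 1]

print(f.shape)    # (2, 2)
print(box.shape)  # (3, 3)
```

For arrays with negative values, a row such as `[1, -1]` also sums to zero, so a test like `a.any(1)` is probably a safer mask than the sum.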
