numpy: split 1D array of chunks separated by nans into a list of the chunks

广开言路 2020-12-14 13:13

I have a NumPy array in which only some values are valid and the rest are NaN. Example:

[nan,nan, 1 , 2 , 3 , nan, nan, 10, 11 , nan, nan, nan, 23, 1, nan,          


        
2 answers
  • 2020-12-14 13:19

    Here is another possibility:

    import numpy as np
    nan = np.nan
    
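    def using_clump(a):
        a = np.asarray(a)  # accept plain lists too, so each chunk comes back as an array
        # mask the NaNs, then slice out each contiguous run of unmasked values
        return [a[s] for s in np.ma.clump_unmasked(np.ma.masked_invalid(a))]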
    
    x = [nan,nan, 1 , 2 , 3 , nan, nan, 10, 11 , nan, nan, nan, 23, 1, nan, 7, 8]
    
    In [56]: using_clump(x)
    Out[56]: 
    [array([ 1.,  2.,  3.]),
     array([ 10.,  11.]),
     array([ 23.,   1.]),
     array([ 7.,  8.])]
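
To see why this works, the one-liner can be unpacked into its three steps. This is only an illustrative sketch on a shorter array; the variable names `masked`, `runs` and `chunks` are made up for the example and are not part of the answer above.

```python
import numpy as np

a = np.array([np.nan, np.nan, 1, 2, 3, np.nan, np.nan, 10, 11, np.nan])

# Step 1: mask every invalid entry (NaN, and also inf).
masked = np.ma.masked_invalid(a)

# Step 2: find the contiguous runs of unmasked entries.
# clump_unmasked returns them as a list of slice objects.
runs = np.ma.clump_unmasked(masked)

# Step 3: apply each slice to the original array to get the chunks.
chunks = [a[s] for s in runs]

for s, chunk in zip(runs, chunks):
    print(s, chunk)
```

Because the chunks are slices of `a`, they are views rather than copies, which is part of why this approach stays cheap on large arrays.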
    

    Some benchmarks comparing using_clump and using_groupby:

    import itertools as IT
    groupby = IT.groupby
    def using_groupby(a):
        return [list(v) for k,v in groupby(a,np.isfinite) if k]
    

    In [58]: %timeit using_clump(x)
    10000 loops, best of 3: 37.3 us per loop
    
    In [59]: %timeit using_groupby(x)
    10000 loops, best of 3: 53.1 us per loop
    

    The gap widens for larger inputs, where using_clump is roughly ten times faster:

    In [9]: x = x*1000
    In [12]: %timeit using_clump(x)
    100 loops, best of 3: 5.69 ms per loop
    
    In [13]: %timeit using_groupby(x)
    10 loops, best of 3: 60 ms per loop
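
The timings above come from IPython's `%timeit`. Outside IPython, a comparable measurement could be sketched with the standard `timeit` module, as below. The `np.asarray` call in `using_clump`, the name `big`, and the repeat counts are choices made for this sketch, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
from itertools import groupby

import numpy as np

nan = np.nan


def using_clump(a):
    a = np.asarray(a)  # assumption: convert list input so chunks are arrays
    return [a[s] for s in np.ma.clump_unmasked(np.ma.masked_invalid(a))]


def using_groupby(a):
    return [list(v) for k, v in groupby(a, np.isfinite) if k]


x = [nan, nan, 1, 2, 3, nan, nan, 10, 11, nan, nan, nan, 23, 1, nan, 7, 8]
big = x * 1000  # list repetition, as in the session above

for name, func in [("using_clump", using_clump), ("using_groupby", using_groupby)]:
    # best of 3 repeats, 10 calls per repeat, reported per call
    best = min(timeit.repeat(lambda: func(big), number=10, repeat=3)) / 10
    print(f"{name}: {best * 1e3:.2f} ms per call")
```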
    
  • 2020-12-14 13:21

    I'd use itertools.groupby; it might be slightly faster:

    from itertools import groupby
    import numpy as np
    
    nan = np.nan  # the np.NaN alias was removed in NumPy 2.0
    a = np.array([nan,nan, 1 , 2 , 3 , nan, nan, 10, 11 , nan, nan, nan, 23, 1, nan, 7, 8])
    # groupby splits a into consecutive runs keyed by np.isfinite; keep only the finite runs
    result = [list(v) for k, v in groupby(a, np.isfinite) if k]
    print(result)  # four chunks: [1.0, 2.0, 3.0], [10.0, 11.0], [23.0, 1.0], [7.0, 8.0]
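
As a rough illustration of what `groupby` does here, the sketch below lists each run together with its key on a shorter array. The names `runs` and `chunks` are hypothetical and chosen only for this example.

```python
from itertools import groupby

import numpy as np

a = np.array([np.nan, 1.0, 2.0, np.nan, np.nan, 5.0])

# Consecutive elements with the same np.isfinite value form one group.
# Record each group's key and length to show how the array is partitioned.
runs = [(bool(k), len(list(v))) for k, v in groupby(a, np.isfinite)]
print(runs)  # [(False, 1), (True, 2), (False, 2), (True, 1)]

# Keeping only the groups whose key is True yields the valid chunks.
chunks = [[float(x) for x in v] for k, v in groupby(a, np.isfinite) if k]
print(chunks)  # [[1.0, 2.0], [5.0]]
```

Note that `np.isfinite` is also False for infinities, so `inf` values act as separators in the same way NaN does.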
    