Extend numpy mask by n cells to the right for each bad value, efficiently

[愿得一人] 2021-02-15 15:39

Let's say I have a length 30 array with 4 bad values in it. I want to create a mask for those bad values, but since I will be using rolling window functions, I'd also like a f

7 answers
  •  轮回少年
    2021-02-15 15:56

    You can use the same cumsum trick as you would for a running average filter:

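    import numpy as np

    def cumsum_trick(a, n):
        mask = np.isnan(a)
        # Running count of bad values seen so far.
        cs = np.cumsum(mask)
        # Subtract the count from n cells back, leaving the count within the last n cells.
        cs[n:] -= cs[:-n].copy()
        return cs > 0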
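
    To see why the windowed count gives the extended mask, here is a small check against a plain loop. This is only a sketch: `naive_extend` and the sample array are made up for illustration, and n is taken to be the total length of each masked run (the bad cell plus the n - 1 cells after it), which is how the function above behaves.

```python
import numpy as np

def cumsum_trick(a, n):
    mask = np.isnan(a)
    cs = np.cumsum(mask)
    cs[n:] -= cs[:-n].copy()
    return cs > 0

def naive_extend(a, n):
    # Hypothetical reference implementation, used only for comparison.
    out = np.zeros(len(a), dtype=bool)
    for i in np.flatnonzero(np.isnan(a)):
        out[i:i + n] = True  # slicing stops at the end of the array
    return out

# Illustrative data: bad values at positions 2 and 7.
a = np.arange(10, dtype=float)
a[[2, 7]] = np.nan

result = cumsum_trick(a, 3)
print(result.astype(int))  # [0 0 1 1 1 0 0 1 1 1]
print(np.array_equal(result, naive_extend(a, 3)))  # True
```

    The loop version is the easiest one to reason about, so it is handy as a test oracle even if it is too slow for real use.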
    

    Unfortunately the additional .copy() is needed because of internal buffering that affects the order of operations. It is possible to persuade numpy to apply the subtraction in reverse, but for that to work the cs array must have a negative stride:

    def cumsum_trick_nocopy(a, n):
        mask = np.isnan(a)
        # Write the cumulative sum into a reversed view so that cs has a negative stride.
        cs = np.cumsum(mask, out=np.empty_like(a, int)[::-1])
        cs[n:] -= cs[:-n]
        return cs > 0
    

    But this seems fragile and isn't really that much faster anyway.

    I wonder if there's a compiled signal processing function somewhere that does this exact operation.
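
    One possible candidate, offered as a guess rather than a tested recommendation, is np.convolve with a kernel of n ones: the "full" convolution counts the bad values in the current cell and the n - 1 cells before it, which is the same quantity the cumsum trick computes. The name `convolve_extend` and the sample data are made up for this sketch, and it assumes n >= 1. It does O(len(a) * n) work, so it may well be slower than the cumsum version for large n.

```python
import numpy as np

def convolve_extend(a, n):
    # Assumes n >= 1; np.convolve rejects an empty kernel.
    mask = np.isnan(a)
    kernel = np.ones(n, dtype=int)
    # Default mode is 'full', giving len(a) + n - 1 values;
    # the first len(a) of them line up with the input cells.
    counts = np.convolve(mask.astype(int), kernel)
    return counts[:len(a)] > 0

# Illustrative data: bad values at positions 2 and 7.
a = np.arange(10, dtype=float)
a[[2, 7]] = np.nan

extended = convolve_extend(a, 3)
print(extended.astype(int))  # [0 0 1 1 1 0 0 1 1 1]
```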


    For sparse initial masks and small n this one is also pretty fast:

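    def index_expansion(a, n):
        mask = np.isnan(a)
        idx = np.flatnonzero(mask)
        # Broadcast each bad index against the offsets 1..n-1.
        expanded_idx = idx[:,None] + np.arange(1, n)
        # 'clip' maps indices past the end onto the last cell.
        np.put(mask, expanded_idx, True, 'clip')
        return mask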
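
    Stepping through the index expansion on a small array shows what the broadcasting and the 'clip' mode do. The array and n below are made up for illustration. Clipping is harmless here: an index can only run past the end if the last cell is already part of the same masked run.

```python
import numpy as np

# Illustrative data: bad values at positions 1 and 6 of an 8-cell array.
a = np.arange(8, dtype=float)
a[[1, 6]] = np.nan
n = 3

mask = np.isnan(a)
idx = np.flatnonzero(mask)                     # [1 6]
# Column vector plus row vector broadcasts to one row per bad value.
expanded_idx = idx[:, None] + np.arange(1, n)  # [[2 3] [7 8]]
# Index 8 is out of range for 8 cells and is clipped to 7.
np.put(mask, expanded_idx, True, mode='clip')

print(expanded_idx.tolist())  # [[2, 3], [7, 8]]
print(mask.astype(int))       # [0 1 1 1 0 0 1 1]
```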
    
