Filling gaps in a numpy array

旧时难觅i 2020-12-05 01:39

I just want to interpolate, in the simplest possible terms, a 3D dataset. Linear interpolation, nearest neighbour, all that would suffice (this is to start off some algorith

5 answers
  •  一个人的身影
    2020-12-05 01:52

    You could tackle your problem along these lines:

    # main ideas described in very high level pseudo code
    choose suitable base kernel shape and type (gaussian?)
    while true
        loop over your array (moving average manner)
            adapt your base kernel to current sparsity pattern
            set current value based on adapted kernel
        break if converged
    

    This can actually be implemented in quite a straightforward manner (especially if performance is not a top concern).

    Obviously this is just a heuristic, and you need to experiment with your actual data to find a proper adaptation scheme. If you view kernel adaptation as kernel reweighting, you may want to base it on how the values have been propagated. For example, the weights of the original supports are 1, and the weights of filled-in values decay according to the iteration in which they emerged.

    Determining when this process has actually converged may also be tricky. Depending on the application, it may eventually be reasonable to leave some 'gap regions' unfilled.
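
    As a rough illustration of the stopping question, here is a minimal sketch (not the implementation given in this answer) that stops when nothing is left to fill, when a pass fills nothing, or when an iteration budget runs out, so that gaps far from any data stay NaN. The name fill_until_stalled and its parameters are hypothetical, odd window sizes are assumed, and it relies on numpy.lib.stride_tricks.sliding_window_view (NumPy 1.20 or newer).

```python
import warnings

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def fill_until_stalled(a, window=(3, 3, 3), max_iter=10):
    """Return (filled copy of a, iterations used, number of NaNs left).

    Hypothetical helper: each pass replaces NaNs by the mean of the
    non-NaN values in their neighbourhood, computed from a snapshot.
    """
    a = np.array(a, dtype=float)               # work on a copy
    pad = [(w // 2, w // 2) for w in window]   # assumes odd window sizes
    axes = tuple(range(-len(window), 0))       # the trailing window axes
    iterations = 0
    while iterations < max_iter:
        gaps = np.isnan(a)
        if not gaps.any():
            break                              # converged: nothing to fill
        padded = np.pad(a, pad, constant_values=np.nan)
        windows = sliding_window_view(padded, window)
        with warnings.catch_warnings():
            # nanmean warns about neighbourhoods that are entirely NaN
            warnings.simplefilter("ignore", RuntimeWarning)
            means = np.nanmean(windows, axis=axes)
        fillable = gaps & ~np.isnan(means)
        if not fillable.any():
            break                              # stalled: no data within reach
        a[fillable] = means[fillable]
        iterations += 1
    return a, iterations, int(np.isnan(a).sum())
```

    With a small max_iter, only gaps within a few window widths of real data are filled, which is one simple way to leave distant 'gap regions' alone.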

    Update: Here is a very simple implementation along the lines *) described above:

    import warnings
    from numpy import any, asarray as asa, isnan, nan as NaN, nanmean, ones
    from numpy.lib.stride_tricks import as_strided as ast
    
    def _a2t(a):
        """Array to tuple."""
        return tuple(a.tolist())
    
    def _view(D, shape, strides):
        """Flattened neighbourhoods of D (the reshape makes a copy)."""
        V = ast(D, shape=shape, strides=strides)
        return V.reshape(V.shape[:D.ndim] + (-1,))
    
    def filler(A, n_shape, n_iter=49):
        """Fill in NaNs in A (in place) with the mean of their non-NaN neighbours."""
        # boundary conditions: embed A in a NaN-padded array so that
        # every element has a complete neighbourhood
        D = NaN * ones(_a2t(asa(A.shape) + asa(n_shape) - 1), dtype=A.dtype)
        slc = tuple(slice(n // 2, n // 2 + s) for n, s in zip(n_shape, A.shape))
        D[slc] = A
    
        # neighbourhood: one window of shape n_shape per element of A
        shape = _a2t(asa(D.shape) - asa(n_shape) + 1) + tuple(n_shape)
        strides = D.strides * 2
    
        # iterate until no NaNs, but not more than n_iter iterations
        with warnings.catch_warnings():
            # nanmean warns about neighbourhoods that are entirely NaN
            warnings.simplefilter('ignore', RuntimeWarning)
            for k in range(n_iter):
                M = isnan(D[slc])
                if not any(M):
                    break
                D[slc][M] = nanmean(_view(D, shape, strides), -1)[M]
        A[:] = D[slc]
    

    A simple demonstration of filler(.) in action would look something like this:

    In []: x= ones((3, 6, 99))
    In []: x.sum(-1)
    Out[]:
    array([[ 99.,  99.,  99.,  99.,  99.,  99.],
           [ 99.,  99.,  99.,  99.,  99.,  99.],
           [ 99.,  99.,  99.,  99.,  99.,  99.]])
    In []: x= NaN* x
    In []: x[1, 2, 3]= 1
    In []: x.sum(-1)
    Out[]:
    array([[ nan,  nan,  nan,  nan,  nan,  nan],
           [ nan,  nan,  nan,  nan,  nan,  nan],
           [ nan,  nan,  nan,  nan,  nan,  nan]])
    In []: filler(x, (3, 3, 5))
    In []: x.sum(-1)
    Out[]:
    array([[ 99.,  99.,  99.,  99.,  99.,  99.],
           [ 99.,  99.,  99.,  99.,  99.,  99.],
           [ 99.,  99.,  99.,  99.,  99.,  99.]])
    

    *) Here nanmean(.) is used only to demonstrate the idea of the adaptation process. Based on this demonstration, it should be quite straightforward to implement a more complex adaptation and decaying weighting scheme. Also note that no attention has been paid to execution performance, but it should still be acceptable for reasonably sized inputs.
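
    One possible step beyond nanmean(.), sketched here under assumptions rather than taken from this answer: use a Gaussian base kernel and renormalise it over the values that are currently known, so the kernel effectively adapts to the local sparsity pattern. The names gaussian_kernel and kernel_filler, the sigma parameter, and the use of sliding_window_view (NumPy 1.20 or newer) are all illustrative choices, and odd window sizes are assumed.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def gaussian_kernel(window, sigma=1.0):
    """Unnormalised Gaussian weights on a grid of (odd) shape `window`."""
    grids = np.meshgrid(*[np.arange(w) - w // 2 for w in window], indexing="ij")
    dist2 = sum(g.astype(float) ** 2 for g in grids)
    return np.exp(-dist2 / (2.0 * sigma ** 2))


def kernel_filler(a, window=(3, 3, 3), sigma=1.0, max_iter=50):
    """Return a copy of `a` with NaNs filled by a renormalised Gaussian mean."""
    a = np.array(a, dtype=float)               # work on a copy
    kernel = gaussian_kernel(window, sigma)
    pad = [(w // 2, w // 2) for w in window]   # assumes odd window sizes
    axes = tuple(range(-len(window), 0))       # the trailing window axes
    for _ in range(max_iter):
        known = ~np.isnan(a)
        if known.all() or not known.any():
            break                              # nothing to fill, or no data
        vals = np.pad(np.where(known, a, 0.0), pad)   # gaps contribute 0
        mask = np.pad(known.astype(float), pad)       # 1 = known, 0 = gap
        # weighted sum of known values, and sum of the weights actually used
        num = (sliding_window_view(vals, window) * kernel).sum(axis=axes)
        den = (sliding_window_view(mask, window) * kernel).sum(axis=axes)
        fill = ~known & (den > 0)
        a[fill] = num[fill] / den[fill]
    return a
```

    Dividing by the sum of the weights that fall on known cells is what adapts the kernel: a gap with few known neighbours still gets a properly normalised estimate. Per-cell confidence weights could be multiplied into both sums in the same way.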
