How to collapse columns in pandas on null values?

醉酒成梦 2021-01-14 02:18

Suppose I have the following dataframe:

pd.DataFrame({'col1':    ["a", "a", np.nan, np.nan, np.nan],
            'override1': ["b", np.nan, "b",


        
6 Answers
  •  日久生厌
    2021-01-14 02:53

    A straightforward solution involves forward filling and picking off the last column. This was mentioned in the comments.

    df.ffill(axis=1).iloc[:, -1].to_frame(name='collapsed')
    
      collapsed
    0         c
    1         a
    2         b
    3         c
    4       NaN
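
    To make the forward-fill idea easy to try in isolation, here is a minimal sketch on a small hypothetical frame. The column names `base`, `over1` and `over2` and the data are assumptions for illustration only, not the frame from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame (not the one from the question): columns further to
# the right take priority whenever they hold a non-null value.
df = pd.DataFrame({
    "base":  ["a", "a", np.nan, np.nan],
    "over1": ["b", np.nan, "b", np.nan],
    "over2": ["c", np.nan, np.nan, np.nan],
})

# Forward-fill along each row so the right-most column ends up holding the
# last non-null value of that row, then keep only that column.
collapsed = df.ffill(axis=1).iloc[:, -1].to_frame(name="collapsed")
print(collapsed)
```

    A row that is entirely null stays null in the result, since there is nothing to fill from.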
    

    If you're interested in performance, we can use a modified version of Divakar's justify function:

    pd.DataFrame({'collapsed': justify(
        df.values, invalid_val=np.nan, axis=1, side='right')[:,-1]
    })
    
      collapsed
    0         c
    1         a
    2         b
    3         c
    4       NaN
    

    For reference, the modified justify function:

    import numpy as np
    import pandas as pd
    
    def justify(a, invalid_val=0, axis=1, side='left'):
        """
        Justifies a 2D array
    
        Parameters
        ----------
        a : ndarray
            Input array to be justified
        invalid_val : scalar
            Value treated as missing; pass np.nan to treat nulls as missing
        axis : int
            Axis along which justification is to be made
        side : str
            Direction of justification. It could be 'left', 'right', 'up', 'down'
            It should be 'left' or 'right' for axis=1 and 'up' or 'down' for axis=0.
    
        """
    
        if invalid_val is np.nan:
            mask = pd.notna(a)   # modified for strings
        else:
            mask = a != invalid_val
        # Sorting the boolean mask moves the True (valid) slots to the right/bottom
        justified_mask = np.sort(mask, axis=axis)
        if side in ('up', 'left'):
            justified_mask = np.flip(justified_mask, axis=axis)
        out = np.full(a.shape, invalid_val, dtype=a.dtype)
        if axis == 1:
            out[justified_mask] = a[mask]
        else:
            out.T[justified_mask.T] = a.T[mask.T]
        return out
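
    The key step in the function is sorting the boolean mask. The sketch below walks through that step alone on a small hypothetical object array (the data is an assumption for illustration), for the `axis=1`, `side='right'` case.

```python
import numpy as np
import pandas as pd

# Hypothetical 2x3 object array; np.nan marks the missing entries.
a = np.array([["a", np.nan, "c"],
              [np.nan, "b", np.nan]], dtype=object)

# True where a value is present.
mask = pd.notna(a)

# Booleans sort as False < True, so within each row the True slots
# move to the right-hand side.
justified_mask = np.sort(mask, axis=1)

# Both masks select the same number of cells per row, and boolean indexing
# walks the array in row-major order, so each row's values keep their
# left-to-right order while being packed to the right.
out = np.full(a.shape, np.nan, dtype=object)
out[justified_mask] = a[mask]

# The last column now holds the right-most non-null value of each row.
last = out[:, -1]
print(out)
print(last)
```

    Because this works on the underlying NumPy array rather than going through `ffill`, it avoids pandas overhead, which is where the performance benefit mentioned above would come from.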
    
