pandas: Filling missing values within a group

眼角桃花 2020-12-15 08:54

I have some data from an experiment. Within each trial there are a few single values, surrounded by NA's, that I want to fill out to the entire trial.

2 answers
  •  盖世英雄少女心
    2020-12-15 09:50

    An alternative approach is to use first_valid_index and a transform:

    In [11]: g = df.groupby('trial')
    
    In [12]: g['cs_name'].transform(lambda s: s.loc[s.first_valid_index()])
    Out[12]: 
    0     A1
    1     A1
    2     A1
    3     A1
    4     B2
    5     B2
    6     B2
    7     B2
    8     A1
    9     A1
    10    A1
    11    A1
    Name: cs_name, dtype: object
    

    This ought to be more efficient than using ffill followed by a bfill.
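
    For comparison, here is a minimal sketch that runs both approaches side by side. The DataFrame is hypothetical: it is built only to reproduce the output shown above, since the original data is not included here. It checks that the two approaches agree and makes no claim about their relative speed.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three trials of four rows, one known label per trial.
df = pd.DataFrame({
    "trial": [1] * 4 + [2] * 4 + [3] * 4,
    "cs_name": [np.nan, "A1", np.nan, np.nan,
                "B2", np.nan, np.nan, np.nan,
                np.nan, np.nan, np.nan, "A1"],
})

g = df.groupby("trial")["cs_name"]

# Approach from this answer: look up the first non-null value in each group.
via_first_valid = g.transform(lambda s: s.loc[s.first_valid_index()])

# Comparison: forward-fill, then back-fill, within each group.
via_fill = g.transform(lambda s: s.ffill().bfill())

# Both approaches should give the same filled column.
print(via_first_valid.equals(via_fill))
```

    The two only agree when each trial holds a single distinct label. If a trial contained several different non-null values, ffill/bfill would keep each of them in place, while the first_valid_index lookup would overwrite the whole trial with the first one.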

    To update the cs_name column, assign the result back:

    df['cs_name'] = g['cs_name'].transform(lambda s: s.loc[s.first_valid_index()])
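
    One case to watch: if a trial has no non-null value at all, first_valid_index returns None and the .loc lookup is expected to fail. A possible guard is sketched below; the helper name first_valid_or_nan and the sample data are made up for illustration.

```python
import numpy as np
import pandas as pd

def first_valid_or_nan(s):
    """Return the first non-null value of s, or NaN if s has none."""
    idx = s.first_valid_index()  # None when every value in s is null
    return s.loc[idx] if idx is not None else np.nan

# Hypothetical data: trial 2 has no label at all.
df = pd.DataFrame({
    "trial": [1, 1, 2, 2],
    "cs_name": [np.nan, "A1", np.nan, np.nan],
})

# Trial 1 is filled with "A1"; trial 2 stays NaN instead of raising.
filled = df.groupby("trial")["cs_name"].transform(first_valid_or_nan)
print(filled)
```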
    

    Note: I think a method to grab the first non-null object would be a nice enhancement to pandas. In numpy it's an open request, and I don't think there is currently such a method (I could be wrong!).
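
    For the grouped case specifically, the groupby "first" aggregation may already cover this: as far as I know, it skips null entries by default in current pandas versions, so passing it to transform should broadcast the first non-null value across each group. This is a sketch on made-up data, and the behaviour is worth checking against your pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one known label per trial, in different positions.
df = pd.DataFrame({
    "trial": [1, 1, 1, 2, 2, 2],
    "cs_name": [np.nan, "A1", np.nan, np.nan, np.nan, "B2"],
})

# "first" takes the first non-null entry per group (assumed default);
# transform broadcasts it back to every row of that group.
df["cs_name"] = df.groupby("trial")["cs_name"].transform("first")
print(df)
```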
