Retain a few NAs and drop the rest during a stack operation in Python

悲&欢浪女 2020-12-04 04:14

I have a dataframe like the one shown below:

df2 = pd.DataFrame({'person_id':[1],'H1_date' : ['2006-10-30 00:00:00'], 'H1':[2.3],'H2_date' : ['2016-10-30


        
3 answers
  •  忘掉有多难
    2020-12-04 04:45

    One approach is to melt the DataFrame, assign a key that identifies columns in the same "group" (in this case the number after H, but you can amend that as required), then group by person and that key and keep only the groups containing at least one non-NA value, e.g.:

    Starting with:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'person_id':[1],'H1_date' : ['2006-10-30 00:00:00'], 'H1':[2.3],'H2_date' : ['2016-10-30 00:00:00'], 'H2':[12.3],'H3_date' : ['2026-11-30 00:00:00'], 'H3':[22.3],'H4_date' : ['2106-10-30 00:00:00'], 'H4':[42.3],'H5_date' : [np.nan], 'H5':[np.nan],'H6_date' : ['2006-10-30 00:00:00'], 'H6':[2.3],'H7_date' : [np.nan], 'H7':[2.3],'H8_date' : ['2006-10-30 00:00:00'], 'H8':[np.nan]})
    

    Use:

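    df2 = (
        df.melt(id_vars='person_id')
        # group key: the number after "H" (H1 and H1_date both map to '1')
        .assign(_gid=lambda v: v['variable'].str.extract(r'H(\d+)', expand=False))
        .groupby(['person_id', '_gid'])
        # keep a group if at least one of its values is not NA
        # (notna() also keeps legitimate zeros, which a bare .any() would treat as falsy)
        .filter(lambda g: bool(g['value'].notna().any()))
        .drop(columns='_gid')
    )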
    

    Which gives you:

        person_id variable                value
    0           1  H1_date  2006-10-30 00:00:00
    1           1       H1                  2.3
    2           1  H2_date  2016-10-30 00:00:00
    3           1       H2                 12.3
    4           1  H3_date  2026-11-30 00:00:00
    5           1       H3                 22.3
    6           1  H4_date  2106-10-30 00:00:00
    7           1       H4                 42.3
    10          1  H6_date  2006-10-30 00:00:00
    11          1       H6                  2.3
    12          1  H7_date                  NaN
    13          1       H7                  2.3
    14          1  H8_date  2006-10-30 00:00:00
    15          1       H8                  NaN
    

    You can then use that as a starting point to tweak if necessary.
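
    As one possible tweak, here is a minimal sketch that puts each group's date and value side by side instead of on separate rows. The question's desired output is not visible here, so this layout is only an assumption; the names `long_df`, `kind` and `tidy` are made up for illustration, and the frame is a shortened version of the one above. It relies on `DataFrame.pivot` accepting a list for `index`, which needs pandas 1.1 or later.

```python
import numpy as np
import pandas as pd

# Shortened version of the example frame (assumption: same column naming scheme)
df = pd.DataFrame({
    'person_id': [1],
    'H1_date': ['2006-10-30 00:00:00'], 'H1': [2.3],
    'H5_date': [np.nan], 'H5': [np.nan],
    'H7_date': [np.nan], 'H7': [2.3],
    'H8_date': ['2006-10-30 00:00:00'], 'H8': [np.nan],
})

long_df = df.melt(id_vars='person_id')
# _gid: the number after "H", shared by H<n> and H<n>_date
long_df['_gid'] = long_df['variable'].str.extract(r'H(\d+)', expand=False)
# kind: 'date' for the *_date columns, 'value' for the measurement columns
long_df['kind'] = np.where(long_df['variable'].str.endswith('_date'), 'date', 'value')

tidy = (
    long_df.pivot(index=['person_id', '_gid'], columns='kind', values='value')
    .dropna(how='all')   # drop a group only when both its date and value are NA
    .reset_index()
)
print(tidy)
```

    Groups with a single missing entry (H7, H8) are retained with a NaN in the corresponding column, while the fully empty H5 group is dropped.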
