How to merge/combine columns in pandas?

日久生厌 2020-12-09 18:35

I have a (example-) dataframe with 4 columns:

data = {'A': ['a', 'b', 'c', 'd', 'e', 'f'],
        'B': [42, 52, np.nan, np.nan, np.nan, np.nan],


        
4 Answers
  •  攒了一身酷
    2020-12-09 19:27

    The question as written asks how to merge/combine columns rather than sum them, so I'm posting this for anyone who lands here looking for help with coalescing via combine_first, which can be a bit tricky.

    df2 = pd.concat([df["A"],
                     df["B"].combine_first(df["C"]).combine_first(df["D"])],
                    axis=1)
    df2.rename(columns={"B":"E"}, inplace=True)

       A     E
    0  a  42.0
    1  b  52.0
    2  c  31.0
    3  d   2.0
    4  e  62.0
    5  f  70.0
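    Since the question's data is cut off after column B, here is a self-contained sketch of the same coalescing step. The C and D values are assumptions, chosen only so that the result matches the output above; they are not taken from the original question.

```python
import numpy as np
import pandas as pd

# Columns A and B come from the question. C and D are HYPOTHETICAL values,
# picked only to reproduce the output shown in this answer.
df = pd.DataFrame({
    "A": ["a", "b", "c", "d", "e", "f"],
    "B": [42, 52, np.nan, np.nan, np.nan, np.nan],
    "C": [np.nan, np.nan, 31, 2, np.nan, np.nan],
    "D": [np.nan, np.nan, np.nan, np.nan, 62, 70],
})

# Coalesce: take B where it is present, otherwise C, otherwise D.
coalesced = df["B"].combine_first(df["C"]).combine_first(df["D"])

# Name the result E directly instead of renaming afterwards.
df2 = pd.concat([df["A"], coalesced.rename("E")], axis=1)
print(df2)
```

    Naming the series with rename("E") before concatenating avoids the separate in-place rename, but either form should give the same frame.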
    

    What's so tricky about that? In this case, nothing. But suppose you were pulling the B, C and D values from different dataframes in which the a, b, c, d, e, f labels were all present, but not necessarily in the same order. combine_first() aligns on the index, not on row position, so you'd need to tack a set_index() onto each of your df references. The astype(int) at the end works only because no NaN values remain after coalescing.

    df2 = pd.concat([df.set_index("A", drop=False)["A"],
                     df.set_index("A")["B"]
                     .combine_first(df.set_index("A")["C"])
                     .combine_first(df.set_index("A")["D"]).astype(int)],
                    axis=1).reset_index(drop=True)
    df2.rename(columns={"B":"E"}, inplace=True)
    
       A   E
    0  a  42
    1  b  52
    2  c  31
    3  d   2
    4  e  62
    5  f  70
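    To make the alignment point concrete, here is a small sketch with two separate frames whose rows are in different orders. The frames `left` and `right` and their values are hypothetical, made up for illustration only.

```python
import numpy as np
import pandas as pd

# HYPOTHETICAL frames: the same labels in column A, but in different row orders.
left = pd.DataFrame({"A": ["a", "b", "c", "d"],
                     "B": [42, 52, np.nan, np.nan]})
right = pd.DataFrame({"A": ["d", "c", "b", "a"],
                      "C": [2, 31, np.nan, np.nan]})

# Default integer index: row 0 of left ("a") is paired with row 0 of right ("d"),
# so the gaps in B line up with gaps in C and stay unfilled.
by_position = left["B"].combine_first(right["C"])

# Index set to A: values are matched by label, regardless of row order.
by_label = left.set_index("A")["B"].combine_first(right.set_index("A")["C"])

print(by_position)
print(by_label)
```

    With the default index the two NaN entries in B are left as NaN, whereas after set_index("A") the values for c and d are picked up from the other frame.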
    
