Python pandas: replace values multiple columns matching multiple columns from another dataframe

清歌不尽 2020-12-31 20:42

I searched a lot for an answer; the closest question was "Compare 2 columns of 2 different pandas dataframes, if the same insert 1 into the other in Python", but the answer to

2 answers
  •  春和景丽
    2020-12-31 21:01

    Start by renaming the columns you want to merge on in df2:

    df2.rename(columns={'OCHR':'chr','OSTOP':'pos'},inplace=True)
    

    Now merge on these columns

    df_merged = pd.merge(df1, df2, how='inner', on=['chr', 'pos']) # you might have to preserve the df1 index at this stage, not sure
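
The comment on the merge line hints that df1's index may need to be preserved. A minimal sketch of why, using small made-up frames (`left`, `right`, and their values are hypothetical, not the asker's data): `pd.merge` on columns returns a fresh 0..n-1 index, so one way to keep the original row labels is to move them into a column with `reset_index()` before merging and restore them afterwards.

```python
import pandas as pd

# Hypothetical frames; `left` has a non-default index so the effect is visible.
left = pd.DataFrame({"chr": [1, 1, 3], "pos": [10, 20, 30]}, index=[100, 101, 102])
right = pd.DataFrame({"chr": [1, 3], "pos": [20, 30], "D": ["rsA", "rsB"]})

# A plain column-on-column merge discards the left index.
plain = pd.merge(left, right, how="inner", on=["chr", "pos"])

# reset_index() stores the labels in a column named "index", which survives the merge
# and can be turned back into the index afterwards.
kept = (
    pd.merge(left.reset_index(), right, how="inner", on=["chr", "pos"])
    .set_index("index")
)

print(plain.index.tolist())
print(kept.index.tolist())
```

With the labels kept this way, the merged rows can later be lined up against the original frame.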
    

    Next, you want to build the frame that will update df1:

    # select the replacement columns and rename them to match df1; this will be your update frame
    df1_updater = df_merged[['D','CHR','STOP']].rename(columns={'D':'snp','CHR':'chr','STOP':'pos'})
    

    Finally update (see bottom of this link):

    df1.update(df1_updater)  # updates df1 in place
    #  chr          snp  x    pos a1 a2
    #0   1  rs376643643  0  10040  G  A
    #1   1  rs373328635  0  10066  C  G
    #2   1   rs62651026  0  10208  C  G
    #3   1  rs376007522  0  10209  C  G
    #4   3  rs368469931  0  30247  C  T
    

    update works by matching on index and column labels, so you might have to carry the index of df1 through the entire process, then call df1_updater.reindex(...) before df1.update(df1_updater)
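
The steps in this answer can be put together as one runnable sketch. The column names (`chr`, `snp`, `pos`, `OCHR`, `OSTOP`, `D`, `CHR`, `STOP`) come from the answer; the data values are made up for illustration, and carrying the index with `reset_index()` / `set_index("index")` is one possible way to do the alignment the last paragraph describes, not necessarily what the answerer had in mind.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the answer.
df1 = pd.DataFrame({
    "chr": [1, 1, 1, 3],
    "snp": ["rs1", "rs2", "rs3", "rs4"],
    "x": [0, 0, 0, 0],
    "pos": [10040, 10066, 10208, 30247],
    "a1": ["G", "C", "C", "C"],
    "a2": ["A", "G", "G", "T"],
})
df2 = pd.DataFrame({
    "OCHR": [1, 3],
    "OSTOP": [10066, 30247],
    "D": ["rsA", "rsB"],
    "CHR": [2, 4],
    "STOP": [20066, 40247],
})

# Rename the df2 key columns on a copy so df2 itself is left untouched.
keys = df2.rename(columns={"OCHR": "chr", "OSTOP": "pos"})

# reset_index() puts df1's row labels into an "index" column so they survive the merge.
merged = df1.reset_index().merge(keys, how="inner", on=["chr", "pos"])

# Update frame: df1's original labels as the index, replacement values under df1's column names.
df1_updater = (
    merged.set_index("index")[["D", "CHR", "STOP"]]
    .rename(columns={"D": "snp", "CHR": "chr", "STOP": "pos"})
)

# update() aligns on index and column labels and overwrites only the matched rows.
df1.update(df1_updater)
print(df1)
```

Rows 1 and 3 of `df1` get new `snp`, `chr`, and `pos` values, while the other rows and columns are left alone. Depending on the pandas version, `update` may return the integer columns as floats, so an `astype(int)` afterwards may be needed.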
