Merging DataFrames on multiple conditions - not specifically on equal values

梦如初夏 2020-12-17 00:47

Firstly, sorry if this is a bit lengthy, but I wanted to fully describe what I am having problems with and what I have already tried.

I am trying to join (merge) t

2 answers
  •  执笔经年
    2020-12-17 01:37

    You can use the following to accomplish what you're looking for:

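    # Pair every SNP with every gene on the same chromosome.
    merged_df = snp_df.merge(gene_df, on='chromosome', how='inner')
    # Keep only the pairs where the SNP position falls inside the gene's range.
    merged_df = merged_df[(merged_df.BP >= merged_df.chr_start) & (merged_df.BP <= merged_df.chr_stop)][['SNP', 'feature_id']]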
    

    Note: your example dataframes do not meet your join criteria. Here is an example using modified dataframes:

    snp_df
    Out[193]: 
       chromosome        SNP      BP
    0           1  rs3094315  752566
    1           1  rs3131972   30400
    2           1  rs2073814  753474
    3           1  rs3115859  754503
    4           1  rs3131956  758144
    
    gene_df
    Out[194]: 
       chromosome  chr_start  chr_stop        feature_id
    0           1      10954     11507  GeneID:100506145
    1           1      12190     13639  GeneID:100652771
    2           1      14362     29370     GeneID:653635
    3           1      30366     30503  GeneID:100302278
    4           1      34611     36081     GeneID:645520
    
    merged_df
    Out[195]: 
             SNP        feature_id
    8  rs3131972  GeneID:100302278
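
For anyone who wants to try this end to end, here is a minimal self-contained sketch that rebuilds the two modified dataframes shown above and applies the same merge-then-filter approach. The intermediate names `candidates` and `in_range` are only illustrative and not part of the original answer.

```python
import pandas as pd

# Rebuild the modified example dataframes from the answer.
snp_df = pd.DataFrame({
    "chromosome": [1, 1, 1, 1, 1],
    "SNP": ["rs3094315", "rs3131972", "rs2073814", "rs3115859", "rs3131956"],
    "BP": [752566, 30400, 753474, 754503, 758144],
})

gene_df = pd.DataFrame({
    "chromosome": [1, 1, 1, 1, 1],
    "chr_start": [10954, 12190, 14362, 30366, 34611],
    "chr_stop": [11507, 13639, 29370, 30503, 36081],
    "feature_id": [
        "GeneID:100506145",
        "GeneID:100652771",
        "GeneID:653635",
        "GeneID:100302278",
        "GeneID:645520",
    ],
})

# Step 1: equality join on chromosome. This pairs every SNP with every
# gene on the same chromosome (5 x 5 = 25 candidate rows here).
candidates = snp_df.merge(gene_df, on="chromosome", how="inner")

# Step 2: keep only the pairs where BP lies within [chr_start, chr_stop].
in_range = (candidates["BP"] >= candidates["chr_start"]) & (
    candidates["BP"] <= candidates["chr_stop"]
)
merged_df = candidates.loc[in_range, ["SNP", "feature_id"]]

print(merged_df)
```

Keep in mind that the intermediate merge holds one row for every SNP/gene pair sharing a chromosome, so on large inputs it can get much bigger than either dataframe before the range filter shrinks it again.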
    
