Mapping a dataframe based on the columns from another dataframe

随声附和 posted on 2020-07-14 07:05:26

Question


I have two DataFrames. One looks like this:

df1.head()
#CHR    Start   End Name
chr1    141474  173862  SAP
chr1    745489  753092  ARB
chr1    762988  794826  SAS
chr1    1634175 1669127 ETH
chr1    2281853 2284259 BRB

And the second DataFrame looks as follows:

df2.head()
#chr    start   end
chr1    141477  173860
chr1    745500  753000
chr16   56228385    56229180
chr11   101785507   101786117
chr7    101961796   101962267

I want to match the two DataFrames on their first three columns and create a new DataFrame, df3. That is, where #CHR in df1 equals #chr in df2, keep the rows where df2.start >= df1.Start and df2.end <= df1.End.

For rows that satisfy this condition, the output should look like this:

df3.head()

#chr    start   end Name
chr1    141477  173860  SAP
chr1    745500  753000  ARB

So far I have tried to create a function for doing this:

def start_smaller_than_end(df1,df2):
    if df1.CHR == df2.CHR:
        df2.start >= df1.Start
        df2.End <= df2.End

    return df3

However, when I run it I get the following error:

df3(df1, df2)
name 'df3' is not defined

Any suggestions and help are greatly appreciated.


Answer 1:


I think you can use merge with boolean indexing:

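import pandas as pd

# Pair up rows of df1 and df2 that share the same chromosome
df = pd.merge(df1, df2, how='outer', left_on='#CHR', right_on='#chr')

# Keep only pairs where the df2 interval lies inside the df1 interval
df = df[(df.start >= df.Start) & (df.end <= df.End)]
df = df[['#chr','start','end','Name']]
print(df)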
   #chr   start     end Name
0  chr1  141477  173860  SAP
3  chr1  745500  753000  ARB
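
To try this end to end, here is a minimal self-contained sketch. It assumes the two frames contain exactly the rows shown by df1.head() and df2.head() in the question. It also swaps in how="inner" as an assumption of mine rather than the answer's choice: rows without a chromosome match would carry NaN and fail the comparison anyway, so the filtered result should be the same. The names merged and contained are illustrative.

```python
import pandas as pd

# Sample rows copied from df1.head() and df2.head() in the question.
df1 = pd.DataFrame({
    "#CHR": ["chr1"] * 5,
    "Start": [141474, 745489, 762988, 1634175, 2281853],
    "End": [173862, 753092, 794826, 1669127, 2284259],
    "Name": ["SAP", "ARB", "SAS", "ETH", "BRB"],
})
df2 = pd.DataFrame({
    "#chr": ["chr1", "chr1", "chr16", "chr11", "chr7"],
    "start": [141477, 745500, 56228385, 101785507, 101961796],
    "end": [173860, 753000, 56229180, 101786117, 101962267],
})

# Pair every df1 interval with every df2 interval on the same chromosome.
# Assumption: an inner join is enough, since unmatched rows are dropped later anyway.
merged = pd.merge(df1, df2, how="inner", left_on="#CHR", right_on="#chr")

# Boolean mask: True where the df2 interval lies inside the df1 interval.
contained = (merged["start"] >= merged["Start"]) & (merged["end"] <= merged["End"])

# Keep the matching rows and only the columns requested in the question.
df3 = merged.loc[contained, ["#chr", "start", "end", "Name"]].reset_index(drop=True)
print(df3)
```

Note that the merge builds every same-chromosome pairing before filtering, so on large inputs the intermediate frame can become much bigger than either input.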

EDIT (in response to a comment):

The same logic wrapped in the function start_smaller_than_end:

def start_smaller_than_end(df1,df2):
    df = pd.merge(df1, df2, how='outer', left_on='#CHR', right_on='#chr')
    df = df[(df.start >= df.Start) & (df.end <= df.End)]
    df = df[['#chr','start','end','Name']]
    return df

df3 = start_smaller_than_end(df1, df2)
print(df3)
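
As a side note on the original attempt: the NameError appears because df3 is never assigned anywhere, yet df3(df1, df2) tries to call it as a function. Even with that fixed, an if statement on a column comparison would likely fail, because comparing two Series produces a Series of booleans rather than a single True or False. A small sketch of that behavior, using made-up two-row Series (chrom_a, chrom_b and same_chrom are illustrative names, not from the question):

```python
import pandas as pd

# Hypothetical chromosome columns, for illustration only.
chrom_a = pd.Series(["chr1", "chr1"])
chrom_b = pd.Series(["chr1", "chr7"])

# The comparison is element-wise and returns one boolean per row.
same_chrom = chrom_a == chrom_b

# Using that Series directly in an `if` raises ValueError,
# because pandas cannot reduce it to a single truth value.
try:
    if same_chrom:
        pass
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# The boolean Series is meant to be used as a row filter instead.
print(chrom_b[same_chrom].tolist())
```

This is why the answer filters with a boolean mask inside df[...] instead of branching with if.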


Source: https://stackoverflow.com/questions/39390612/mapping-a-dataframe-based-on-the-columns-from-other-dataframe
