Intersection of multiple pandas dataframes

Submitted by 狂风中的少年 on 2020-03-01 08:48:07

Question


I have a number of dataframes (100) in a list as:

frameList = [df1,df2,..,df100]

Each dataframe has the two columns DateTime, Temperature.

I want to intersect all the dataframes on the common DateTime column and get all their Temperature columns combined/merged into one big dataframe: Temperature from df1, Temperature from df2, Temperature from df3, .., Temperature from df100.

(A single pd.merge call doesn't work here, as I'd have to compute 99 pairwise intersections.)


Answer 1:


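Use pd.concat, which works on a list of DataFrames or Series. It aligns on the index rather than on a column, so set DateTime as the index of each frame first:

pd.concat([df.set_index('DateTime') for df in frameList], axis=1, join='inner')

This is better than chaining pd.merge, as each pd.merge call copies the data pairwise, whereas pd.concat copies only once. However, pd.concat only aligns along an axis, whereas pd.merge can also join on one or more columns. Note that the result has one Temperature column per input frame, all sharing the same name unless you rename them.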
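As a minimal sketch of the pd.concat approach on concrete data: the three small frames and the make_frame helper below are hypothetical stand-ins for df1..df100, and the Temperature_1, Temperature_2, ... column names are my own choice. It also assumes DateTime values are unique within each frame, since an inner concat on a duplicated index may fail.

```python
import pandas as pd

def make_frame(dates, temps):
    """Build a hypothetical two-column frame like the ones in the question."""
    return pd.DataFrame({"DateTime": pd.to_datetime(dates), "Temperature": temps})

# Hypothetical sample data; only 2020-01-02 and 2020-01-03 occur in all three.
frame_list = [
    make_frame(["2020-01-01", "2020-01-02", "2020-01-03"], [1.0, 2.0, 3.0]),
    make_frame(["2020-01-02", "2020-01-03", "2020-01-04"], [20.0, 30.0, 40.0]),
    make_frame(["2020-01-03", "2020-01-02", "2020-01-05"], [300.0, 200.0, 500.0]),
]

# Move DateTime into the index and keep only the Temperature Series,
# because pd.concat aligns on the index, not on a column.
series_list = [df.set_index("DateTime")["Temperature"] for df in frame_list]

# Passing keys gives each Series a distinct column name in the result.
keys = [f"Temperature_{i}" for i in range(1, len(frame_list) + 1)]
combined = pd.concat(series_list, axis=1, join="inner", keys=keys).sort_index()

print(combined)
```

Calling combined.reset_index() afterwards turns DateTime back into an ordinary column if that is the shape you need.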




Answer 2:


You can use functools.reduce to apply pd.merge across the whole list, something like this:

from functools import reduce
dfs = [df0, df1, df2, dfN]
df_final = reduce(lambda left, right: pd.merge(left, right, on='DateTime'), dfs)
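One caveat with the reduce approach when every frame names its value column Temperature: repeated merges produce the default _x/_y suffixes, and from the fourth frame onward those suffixes collide with columns already in the result, which, as far as I know, recent pandas versions reject with a MergeError. A hedged sketch that sidesteps this by renaming the columns first (the sample frames, the make_frame helper, and the Temperature_N names are hypothetical):

```python
from functools import reduce

import pandas as pd

def make_frame(dates, temps):
    """Build a hypothetical two-column frame like the ones in the question."""
    return pd.DataFrame({"DateTime": pd.to_datetime(dates), "Temperature": temps})

# Hypothetical sample data; only 2020-01-02 and 2020-01-03 occur in all four.
dfs = [
    make_frame(["2020-01-01", "2020-01-02", "2020-01-03"], [1.0, 2.0, 3.0]),
    make_frame(["2020-01-02", "2020-01-03", "2020-01-04"], [20.0, 30.0, 40.0]),
    make_frame(["2020-01-03", "2020-01-02", "2020-01-05"], [300.0, 200.0, 500.0]),
    make_frame(["2020-01-02", "2020-01-03", "2020-01-06"], [2000.0, 3000.0, 6000.0]),
]

# Give each Temperature column a unique name so no suffixes are needed.
renamed = [
    df.rename(columns={"Temperature": f"Temperature_{i}"})
    for i, df in enumerate(dfs, start=1)
]

# Fold the list into one frame with successive inner merges on DateTime.
df_final = reduce(
    lambda left, right: pd.merge(left, right, on="DateTime", how="inner"),
    renamed,
)

print(df_final)
```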



Answer 3:


You could iterate over your list like this:

df_merge = frameList[0]
for df in frameList[1:]:       
    df_merge = pd.merge(df_merge, df, on='DateTime', how='inner')
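To check that a loop like this really yields the intersection, one option is to compare the merged DateTime values against a plain set intersection. The sketch below assumes hypothetical sample frames and renames each Temperature column (my own naming) so the columns stay distinguishable:

```python
import pandas as pd

def make_frame(dates, temps):
    """Build a hypothetical two-column frame like the ones in the question."""
    return pd.DataFrame({"DateTime": pd.to_datetime(dates), "Temperature": temps})

# Hypothetical sample data standing in for frameList.
frame_list = [
    make_frame(["2020-01-01", "2020-01-02", "2020-01-03"], [1.0, 2.0, 3.0]),
    make_frame(["2020-01-02", "2020-01-03", "2020-01-04"], [20.0, 30.0, 40.0]),
    make_frame(["2020-01-03", "2020-01-02", "2020-01-05"], [300.0, 200.0, 500.0]),
]

# Same loop as above, with a unique column name per frame.
df_merge = frame_list[0].rename(columns={"Temperature": "Temperature_1"})
for i, df in enumerate(frame_list[1:], start=2):
    right = df.rename(columns={"Temperature": f"Temperature_{i}"})
    df_merge = pd.merge(df_merge, right, on="DateTime", how="inner")

# Independent check: timestamps present in every input frame.
common = set.intersection(*(set(df["DateTime"]) for df in frame_list))
print(set(df_merge["DateTime"]) == common)
```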


Source: https://stackoverflow.com/questions/40533467/intersection-of-multiple-pandas-dataframes
