Concatenating Multiple DataFrames with Non-Standard Columns

Question

Is there a good way to concatenate a list of DataFrames whose columns differ from one DataFrame to the next?

The desired outcome is to line up all columns that match and to keep the unmatched ones off to the side. The unmatched columns need to be kept because a column with no match between the 1st and 2nd DataFrames in the list may still have a match between the 1st and 3rd, so discarding a column at the first lack of a match would not be ideal.

An example:

print(list(datalist[0].columns))
# ['1', '2', '3']

print(list(datalist[1].columns))
# ['1', '2', '4']

print(list(datalist[2].columns))
# ['2', '3', '4']

The output would be a DataFrame laid out like this (shown schematically, with - marking a missing value):

1 2 3 - 
1 2 - 4
- 2 3 4

Answer 1:

import pandas as pd
data = pd.concat(datalist, join='outer', axis=0, ignore_index=True)

This works. I was originally under the impression that concat with join="outer" would simply stack the frames on top of each other without regard to column names. In fact, with join="outer" (which is also the default for pd.concat), columns with the same name are aligned, and any column missing from some of the frames is still kept, with NaN filling the cells where a frame had no such column. That is exactly what is desired. Hope this helps someone else.
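
For anyone who wants to try this out, here is a minimal sketch of the behaviour described above. The three sample frames and their values are made up for illustration (only the column labels come from the question), and the join="inner" call is included purely as a contrast to show what would be lost.

```python
import pandas as pd

# Hypothetical sample frames: the column labels mirror the question,
# the values are invented for illustration.
datalist = [
    pd.DataFrame({"1": [10], "2": [20], "3": [30]}),
    pd.DataFrame({"1": [11], "2": [21], "4": [41]}),
    pd.DataFrame({"2": [22], "3": [32], "4": [42]}),
]

# join="outer" keeps the union of all column labels; cells for columns
# that a given frame does not have are filled with NaN.
outer = pd.concat(datalist, join="outer", axis=0, ignore_index=True)

# join="inner" keeps only the labels present in every frame,
# so the partially matching columns are dropped.
inner = pd.concat(datalist, join="inner", axis=0, ignore_index=True)

print(sorted(outer.columns))  # all four labels survive
print(list(inner.columns))    # only the label shared by every frame
print(outer)
```

With these sample frames the outer result has one row per input frame and four columns, with a NaN wherever a frame lacked that column, which corresponds to the schematic layout in the question.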

Source: https://stackoverflow.com/questions/28842681/concatenating-multiple-dataframes-with-non-standard-columns

Tags

python

pandas

merge

concatenation
