Question
Is there a good way to concatenate a list of DataFrames whose columns are not the same from one DataFrame to the next?
The desired outcome is to line up all columns that match but keep the unmatched ones off to the side. The unmatched columns are worth keeping because a column with no match between the 1st and 2nd DataFrames in the list may still have a match between the 1st and 3rd. Discarding a column on its first failed match would therefore not be ideal.
An example is:
print(list(datalist[0].columns))
# ['1', '2', '3']
print(list(datalist[1].columns))
# ['1', '2', '4']
print(list(datalist[2].columns))
# ['2', '3', '4']
Where the output would be a dataframe like (stylistically represented here):
1 2 3 -
1 2 - 4
- 2 3 4
Answer 1:
import pandas as pd
data = pd.concat(datalist, join='outer', axis=0, ignore_index=True)
This works. I was originally under the impression that concat with join='outer' would just stack the frames on top of each other without regard to column names. In fact, with join='outer' (which is also the default) concat aligns the frames by column name: matching columns are combined, and every unmatched column is kept, with NaN filled in for the rows that came from frames lacking that column. That is exactly what is desired. Hope this helps someone else.
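For anyone who wants to see the alignment directly, here is a minimal sketch. The three single-row frames and their values are made up for illustration; only the column layout follows the question. The join='inner' call at the end is included purely for contrast.

```python
import pandas as pd

# Hypothetical sample frames; only the column names mirror the question.
datalist = [
    pd.DataFrame({'1': [10], '2': [20], '3': [30]}),
    pd.DataFrame({'1': [11], '2': [21], '4': [41]}),
    pd.DataFrame({'2': [22], '3': [32], '4': [42]}),
]

# Outer join: the result has the union of all column names,
# and cells with no source value are filled with NaN.
data = pd.concat(datalist, join='outer', axis=0, ignore_index=True)
print(data)

# Inner join, for contrast: only columns present in every frame survive.
inner = pd.concat(datalist, join='inner', axis=0, ignore_index=True)
print(list(inner.columns))
```

In the outer result each row has NaN in the one column its source frame lacked, which corresponds to the "-" placeholders in the question's sketch. With the inner join only column '2' remains, since it is the only column shared by all three frames.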
Source: https://stackoverflow.com/questions/28842681/concatenating-multiple-dataframes-with-non-standard-columns