Question
How can I merge two different dataframes, keeping all rows from each dataframe while filling in the blanks?
DF1
Name    Addr  Num  Parent  Parent_Addr
Matt    123H  8    James   543F
Adam    213H  9    James   543F
James   321H  10   Mom     654F
Andrew  512F  10   Dad     665F
Faith   555A  7    None    657F
DF2
Name   Parent  Parent_Num  Parent_Addr
Matt   James   10          543F
Adam   James   10          543F
James  Mom     12          654F
None   Ian     13          656F
None   None    None        1234
Expected output
Name    Addr  Num   Parent  Parent_Num  Parent_Addr
Matt    123H  8     James   10          543F
Adam    213H  9     James   10          543F
James   321H  10    Mom     12          654F
Andrew  512F  10    Dad     None        665F
Faith   555A  7     None    None        657F
None    None  None  Ian     13          656F
None    None  None  None    None        1234
I am trying to merge the two dataframes and keep all the data from both. Any help would be greatly appreciated. Thank you.
Answer 1:
You need to merge on all the common columns and use an outer join:
pd.merge(df1, df2, on=['Name', 'Parent', 'Parent_Addr'], how='outer')
   Name    Addr  Num  Parent  Parent_Addr  Parent_Num
0  Matt    123H  8    James   543F         10
1  Adam    213H  9    James   543F         10
2  James   321H  10   Mom     654F         12
3  Andrew  512F  10   Dad     665F         NaN
4  Faith   555A  7    None    657F         NaN
5  None    NaN   NaN  Ian     656F         13
6  None    NaN   NaN  None    1234         None
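For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the two frames from the question and runs the same outer merge. The dtypes are my assumption (the question does not show them): addresses are kept as strings, Num as integers, and None stands for a missing value. Row order in the result can differ between pandas versions, so do not rely on it.

```python
import pandas as pd

# Sample frames rebuilt from the question; dtypes are assumed, None marks a missing value.
df1 = pd.DataFrame({
    "Name": ["Matt", "Adam", "James", "Andrew", "Faith"],
    "Addr": ["123H", "213H", "321H", "512F", "555A"],
    "Num": [8, 9, 10, 10, 7],
    "Parent": ["James", "James", "Mom", "Dad", None],
    "Parent_Addr": ["543F", "543F", "654F", "665F", "657F"],
})
df2 = pd.DataFrame({
    "Name": ["Matt", "Adam", "James", None, None],
    "Parent": ["James", "James", "Mom", "Ian", None],
    "Parent_Num": [10, 10, 12, 13, None],
    "Parent_Addr": ["543F", "543F", "654F", "656F", "1234"],
})

# Outer join on the three shared columns keeps every row from both frames.
merged = pd.merge(df1, df2, on=["Name", "Parent", "Parent_Addr"], how="outer")
print(merged)
```

Rows that exist in only one frame get NaN in the columns that come from the other frame.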
Answer 2:
You can keep all the rows with an 'outer' merge.
Note that by default, merge joins on all common column names.
df1.merge(df2, how='outer')
   Name    Addr  Num   Parent  Parent_Addr  Parent_Num
0  Matt    123H  8.0   James   543F         10
1  Adam    213H  9.0   James   543F         10
2  James   321H  10.0  Mom     654F         12
3  Andrew  512F  10.0  Dad     665F         NaN
4  Faith   555A  7.0   None    657F         NaN
5  None    NaN   NaN   Ian     656F         13
6  None    NaN   NaN   None    1234         None
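A small sketch of how to check what the default join is doing, using the same sample data with assumed dtypes (strings for addresses, integers for Num). The `common` list is only there for illustration; `indicator=True` is a real merge option that adds a `_merge` column saying which frame each row came from. It also shows why Num prints as 8.0 above: an integer column becomes float once NaN is introduced.

```python
import pandas as pd

# Same sample data as in the question; dtypes are assumed.
df1 = pd.DataFrame({
    "Name": ["Matt", "Adam", "James", "Andrew", "Faith"],
    "Addr": ["123H", "213H", "321H", "512F", "555A"],
    "Num": [8, 9, 10, 10, 7],
    "Parent": ["James", "James", "Mom", "Dad", None],
    "Parent_Addr": ["543F", "543F", "654F", "665F", "657F"],
})
df2 = pd.DataFrame({
    "Name": ["Matt", "Adam", "James", None, None],
    "Parent": ["James", "James", "Mom", "Ian", None],
    "Parent_Num": [10, 10, 12, 13, None],
    "Parent_Addr": ["543F", "543F", "654F", "656F", "1234"],
})

# Columns present in both frames; these are what merge uses when `on` is omitted.
common = [c for c in df1.columns if c in df2.columns]
print(common)

# indicator=True adds a `_merge` column: 'both', 'left_only' or 'right_only'.
merged = df1.merge(df2, how="outer", indicator=True)
print(merged["_merge"].value_counts())

# Num started as integers but now holds NaN for the df2-only rows, so it is float.
print(merged["Num"].dtype)
```

Filtering on `_merge` is a quick way to see which rows had no partner in the other frame.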
Source: https://stackoverflow.com/questions/42940507/merging-dataframes-keeping-all-items-pandas