I have 2 DataFrames df1 and df2 with the same column names ['a','b','c'] and indexed by dates. The date index can have similar values. I would like to create a DataFra
What you ask for is the join operation.
With the how argument, you can define how index values that occur in only one of the two DataFrames are handled.
Here is an article that looks helpful on this point.
In the example below, I left out cosmetics (like renaming columns) for simplicity.
Code
import numpy as np
import pandas as pd

# df1 covers 2014-01-02 to 2014-01-06, df2 covers 2014-01-01 to 2014-01-08
df1 = pd.DataFrame(np.random.randn(5, 3), index=pd.date_range('01/02/2014', periods=5, freq='D'), columns=['a', 'b', 'c'])
df2 = pd.DataFrame(np.random.randn(8, 3), index=pd.date_range('01/01/2014', periods=8, freq='D'), columns=['a', 'b', 'c'])

# Outer join on the index; the suffixes disambiguate the shared column names
df3 = df1.join(df2, how='outer', lsuffix='_df1', rsuffix='_df2')
print(df3)
Output
               a_df1     b_df1     c_df1     a_df2     b_df2     c_df2
2014-01-01       NaN       NaN       NaN  0.109898  1.107033 -1.045376
2014-01-02  0.573754  0.169476 -0.580504 -0.664921 -0.364891 -1.215334
2014-01-03 -0.766361 -0.739894 -1.096252  0.962381 -0.860382 -0.703269
2014-01-04  0.083959 -0.123795 -1.405974  1.825832 -0.580343  0.923202
2014-01-05  1.019080 -0.086650  0.126950 -0.021402 -1.686640  0.870779
2014-01-06 -1.036227 -1.103963 -0.821523 -0.943848 -0.905348  0.430739
2014-01-07       NaN       NaN       NaN  0.312005  0.586585  1.531492
2014-01-08       NaN       NaN       NaN -0.077951 -1.189960  0.995123
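
To make the effect of the how argument more concrete, here is a small sketch that runs the same join with each of the four common values and prints the resulting row count. The frames left and right and their values are made up for illustration (they are not the random data above), and a single column is used to keep the example short.

```python
import pandas as pd

# Hypothetical toy frames with partially overlapping date indices
left = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0]},
    index=pd.date_range("2014-01-02", periods=3, freq="D"),  # Jan 2..4
)
right = pd.DataFrame(
    {"a": [10.0, 20.0, 30.0]},
    index=pd.date_range("2014-01-01", periods=3, freq="D"),  # Jan 1..3
)

# Run the same join once per join type
results = {
    how: left.join(right, how=how, lsuffix="_df1", rsuffix="_df2")
    for how in ("left", "right", "inner", "outer")
}

for how, frame in results.items():
    print(how, len(frame))
# left 3   (index of left only)
# right 3  (index of right only)
# inner 2  (dates present in both)
# outer 4  (union of both indices, gaps filled with NaN)
```

If only the dates present in both frames are wanted, how='inner' avoids the NaN rows; how='outer', as used above, keeps every date from either frame.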