Question
I am trying to merge 2 dataframes together. Ironically, they started out as part of the same dataframe, but I am making baby steps -- sometimes in the wrong direction. Frame 1 looks like this:
Int64Index: 10730 entries, 0 to 10729
Data columns (total 6 columns):
RegionID      10730 non-null int64
RegionName    10730 non-null object
State         10730 non-null object
Metro         10259 non-null object
CountyName    10730 non-null object
SizeRank      10730 non-null int64
dtypes: int64(2), object(4)
Frame 2 looks like this:
Int64Index: 10730 entries, 0 to 10729
Data columns (total 82 columns):
1996Q2    8218 non-null float64
1996Q3    8229 non-null float64
1996Q4    8235 non-null float64
.....
2016Q1    10730 non-null float64
2016Q2    10730 non-null float64
2016Q3    10730 non-null float64
dtypes: float64(82)
Notice that the indexes are of the same type, and they even have the same number of rows.
I am trying to merge the dataframes back together like so:
df4 = pd.merge(df3, df2, how='inner', left_index=True, right_index=True)
The error I am getting is:
ValueError: can only call with other PeriodIndex-ed objects
The 2016Q1 and similarly named columns in the 2nd dataframe are of Period type, but I am not merging on them -- I thought that as long as the indexes line up, merge should work. What am I doing wrong?
Answer 1:
Assuming we have the following DFs:
In [44]: df1
Out[44]:
1996Q2 2000Q3 2010Q4
0 1.5 3.5 1.000000
1 22.0 38.5 2.000000
2 15.0 35.0 4.333333
In [45]: df1.columns
Out[45]: PeriodIndex(['1996Q2', '2000Q3', '2010Q4'], dtype='period[Q-DEC]', freq='Q-DEC')
Notice: df1.columns is of the PeriodIndex dtype.
In [46]: df2
Out[46]:
a b c
0 a1 b1 c1
1 a2 b2 c2
2 a3 b3 c3
In [47]: df2.columns
Out[47]: Index(['a', 'b', 'c'], dtype='object')
Both merge and join will raise ValueError: can only call with other PeriodIndex-ed objects because, AFAIK, a pandas DF can't have mixed column label types when some of them belong to a PeriodIndex:
In [48]: df1.join(df2)
...
skipped
...
ValueError: can only call with other PeriodIndex-ed objects
merge throws the same exception:
In [54]: pd.merge(df1, df2, left_index=True, right_index=True)
...
skipped
...
ValueError: can only call with other PeriodIndex-ed objects
So we will have to convert df1.columns to strings:
In [49]: df1.columns = df1.columns.values.astype(str)
In [50]: df1.columns
Out[50]: Index(['1996Q2', '2000Q3', '2010Q4'], dtype='object')
Now join and merge will work:
In [51]: df1.join(df2)
Out[51]:
1996Q2 2000Q3 2010Q4 a b c
0 1.5 3.5 1.000000 a1 b1 c1
1 22.0 38.5 2.000000 a2 b2 c2
2 15.0 35.0 4.333333 a3 b3 c3
In [52]: pd.merge(df1, df2, left_index=True, right_index=True)
Out[52]:
1996Q2 2000Q3 2010Q4 a b c
0 1.5 3.5 1.000000 a1 b1 c1
1 22.0 38.5 2.000000 a2 b2 c2
2 15.0 35.0 4.333333 a3 b3 c3
Column dtype of the merged DF:
In [58]: df1.join(df2).columns
Out[58]: Index(['1996Q2', '2000Q3', '2010Q4', 'a', 'b', 'c'], dtype='object')
If you need df1.columns as a PeriodIndex after the merge is done, you can save df1.columns before converting it and set it back once you are done with merging / joining:
In [60]: df1.columns
Out[60]: PeriodIndex(['1996Q2', '2000Q3', '2010Q4'], dtype='period[Q-DEC]', freq='Q-DEC')
In [61]: cols_saved = df1.columns
In [62]: df1.columns = df1.columns.values.astype(str)
In [63]: df1.columns
Out[63]: Index(['1996Q2', '2000Q3', '2010Q4'], dtype='object')
# merging (joining) or doing something else here ...
In [64]: df1.columns = cols_saved
In [65]: df1.columns
Out[65]: PeriodIndex(['1996Q2', '2000Q3', '2010Q4'], dtype='period[Q-DEC]', freq='Q-DEC')
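The steps above can be put together in one script. This is only a minimal sketch built on the small example frames from this answer. It calls `astype(str)` directly on the PeriodIndex rather than going through `.values`, which is my own substitution, and recent pandas versions may not raise the original error at all:

```python
import pandas as pd

# Example frames modelled on df1 / df2 above.
df1 = pd.DataFrame(
    {
        "1996Q2": [1.5, 22.0, 15.0],
        "2000Q3": [3.5, 38.5, 35.0],
        "2010Q4": [1.0, 2.0, 4.333333],
    }
)
df1.columns = pd.PeriodIndex(df1.columns, freq="Q")  # quarterly Period labels

df2 = pd.DataFrame(
    {"a": ["a1", "a2", "a3"], "b": ["b1", "b2", "b3"], "c": ["c1", "c2", "c3"]}
)

# 1. Remember the original PeriodIndex so it can be restored later.
cols_saved = df1.columns

# 2. Replace the Period labels with plain strings.
df1.columns = df1.columns.astype(str)

# 3. Join on the row index; every column label is now a string.
merged = df1.join(df2)

# 4. Put the PeriodIndex back on df1 (merged keeps its string labels).
df1.columns = cols_saved

print(list(merged.columns))
```

Restoring `cols_saved` only affects `df1`; the joined frame is a separate object and keeps the string labels.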
Answer 2:
I actually had the same issue and was getting integer columns as well.
Instead of
df1.columns = df1.columns.values.astype(str)
I used
df1.columns = df1.columns.format()
Hope this helps
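If the conversion gives you integer labels, another option that should not depend on how the pandas version represents `.values` is to call `str()` on each Period. A small sketch; `period_columns_to_str` is a hypothetical helper name, not a pandas function:

```python
import pandas as pd


def period_columns_to_str(df):
    """Return a copy of df whose PeriodIndex column labels are plain strings."""
    out = df.copy()
    if isinstance(out.columns, pd.PeriodIndex):
        # str() on a Period gives its usual text form, e.g. '1996Q2'.
        out.columns = [str(p) for p in out.columns]
    return out


df1 = pd.DataFrame(
    [[1.5, 3.5], [22.0, 38.5]],
    columns=pd.PeriodIndex(["1996Q2", "2000Q3"], freq="Q"),
)
df1_str = period_columns_to_str(df1)
print(list(df1_str.columns))
```

Because the helper works on a copy, the original frame keeps its PeriodIndex and nothing has to be restored afterwards.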
Source: https://stackoverflow.com/questions/40499756/valueerror-can-only-call-with-other-periodindex-ed-objects