I have a DataFrame, say a volatility surface with time as the index and strike as the columns. How do I do two-dimensional interpolation? I can reindex, but how do I deal with the resulting NaN values? I know we can use fillna(method='pad'), but that is not even linear interpolation. Is there a way to plug in our own interpolation method?
Answer 1:
You can use DataFrame.interpolate to get a linear interpolation.
    In : df = pandas.DataFrame(numpy.random.randn(5,3), index=['a','c','d','e','g'])

    In : df
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    g -1.632493  0.938456  0.492695

    In : df2 = df.reindex(['a','b','c','d','e','f','g'])

    In : df2
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b       NaN       NaN       NaN
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f       NaN       NaN       NaN
    g -1.632493  0.938456  0.492695

    In : df2.interpolate()
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b  0.052363 -1.729055  0.114652
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f -1.330113  1.134579  0.000958
    g -1.632493  0.938456  0.492695
For anything more complex, you need to roll your own function that takes a Series object, fills the NaN values as you like, and returns another Series object.
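As a minimal sketch of such a function: the helper name fill_by_index and the sample data below are hypothetical, and the sketch assumes a sorted numeric index and at least one non-NaN value per column. It fills gaps with numpy.interp, so the spacing of the index values is respected, and points outside the known range take the nearest known value.

```python
import numpy as np
import pandas as pd

def fill_by_index(s):
    """Fill NaNs in a Series by linear interpolation against its numeric index.

    Hypothetical helper: assumes the index is numeric and sorted ascending,
    and that the Series has at least one non-NaN value.
    """
    known = s.notna().to_numpy()
    x = s.index.to_numpy(dtype=float)
    # np.interp interpolates linearly and clamps outside the known range
    filled = np.interp(x, x[known], s.to_numpy(dtype=float)[known])
    return pd.Series(filled, index=s.index, name=s.name)

# Made-up sample data with unevenly spaced index values
df = pd.DataFrame(
    {"a": [1.0, np.nan, 4.0], "b": [10.0, np.nan, 40.0]},
    index=[0.0, 1.0, 3.0],
)

# apply() calls the function once per column, passing each column as a Series
result = df.apply(fill_by_index)
print(result)
```

Passing the function to DataFrame.apply runs it column by column; df.apply(fill_by_index, axis=1) would run it along each row instead, interpolating against the column labels.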
Answer 2:
Old thread, but I thought I would share my solution for 2D extrapolation/interpolation that respects index values and works on demand. The code ended up a bit weird, so let me know if there is a better solution:
    import numpy
    import pandas
    from numpy import nan

    dataGrid = pandas.DataFrame({1: {1: 1, 3: 2}, 2: {1: 3, 3: 4}})

    def getExtrapolatedInterpolatedValue(x, y):
        global dataGrid
        # Add a row for x if needed, then interpolate down the rows by index value
        if x not in dataGrid.index:
            dataGrid.loc[x] = nan  # .loc replaces the removed .ix indexer
            dataGrid = dataGrid.sort_index()  # replaces the removed DataFrame.sort()
            dataGrid = dataGrid.interpolate(method='index', axis=0).ffill(axis=0).bfill(axis=0)
        # Add a column for y if needed, then interpolate across the columns
        if y not in dataGrid.columns.values:
            dataGrid = dataGrid.reindex(columns=numpy.append(dataGrid.columns.values, y))
            dataGrid = dataGrid.sort_index(axis=1)
            dataGrid = dataGrid.interpolate(method='index', axis=1).ffill(axis=1).bfill(axis=1)
        return dataGrid.loc[x, y]

    print(getExtrapolatedInterpolatedValue(2, 1.4))  # prints 2.3
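One possible alternative, offered only as a sketch and not as the poster's method, is to leave the DataFrame untouched and interpolate with numpy.interp in two passes: first down the rows at x, then across the columns at y. The function name interp_2d is hypothetical, and the sketch assumes numeric, ascending index and column labels and a grid with no NaN values. Like the ffill/bfill calls above, numpy.interp holds the edge value for points outside the grid.

```python
import numpy as np
import pandas as pd

def interp_2d(grid, x, y):
    """Bilinear lookup on a DataFrame without modifying it.

    Hypothetical helper: assumes numeric, ascending index and columns
    and no NaN values in the grid.
    """
    rows = grid.index.to_numpy(dtype=float)
    cols = grid.columns.to_numpy(dtype=float)
    values = grid.to_numpy(dtype=float)
    # Pass 1: interpolate each column at row position x
    at_x = np.array([np.interp(x, rows, values[:, j]) for j in range(len(cols))])
    # Pass 2: interpolate the resulting row at column position y
    return float(np.interp(y, cols, at_x))

# Same grid as in the answer above
grid = pd.DataFrame({1: {1: 1, 3: 2}, 2: {1: 3, 3: 4}})
value = interp_2d(grid, 2, 1.4)
print(round(value, 6))
```

Because the grid is never mutated, repeated lookups do not accumulate extra rows and columns, and no global state is needed.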