Interpolation on DataFrame in pandas

Anonymous (unverified), submitted 2019-12-03 02:14:01

Question:

I have a DataFrame, say a volatility surface with time as the index and strike as the columns. How do I do two-dimensional interpolation? I can reindex, but how do I deal with the resulting NaN values? I know we can use fillna(method='pad'), but that is not even linear interpolation. Is there a way to plug in our own interpolation method?

Answer 1:

You can use DataFrame.interpolate to get a linear interpolation.

    In : df = pandas.DataFrame(numpy.random.randn(5,3), index=['a','c','d','e','g'])

    In : df
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    g -1.632493  0.938456  0.492695

    In : df2 = df.reindex(['a','b','c','d','e','f','g'])

    In : df2
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b       NaN       NaN       NaN
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f       NaN       NaN       NaN
    g -1.632493  0.938456  0.492695

    In : df2.interpolate()
    Out:
              0         1         2
    a -1.987879 -2.028572  0.024493
    b  0.052363 -1.729055  0.114652
    c  2.092605 -1.429537  0.204811
    d  0.767215  1.077814  0.565666
    e -1.027733  1.330702 -0.490780
    f -1.330113  1.134579  0.000958
    g -1.632493  0.938456  0.492695
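One point that may matter for a surface indexed by time: the default method='linear' treats the rows as equally spaced and ignores the index values, whereas method='index' uses a numeric index as the x coordinates. The following is a minimal sketch of the difference; the column name "vol" and the numbers are made up purely for illustration.

```python
import pandas as pd

# Hypothetical data: one column observed at unevenly spaced index values.
df = pd.DataFrame({"vol": [0.10, 0.40]}, index=[0.0, 3.0])

# Insert a new index value (1.0) one third of the way from 0.0 to 3.0.
df2 = df.reindex([0.0, 1.0, 3.0])

# Default method='linear': rows are treated as equally spaced,
# so the gap is filled with the midpoint of its neighbours.
equal_spacing = df2.interpolate()

# method='index': the numeric index values are used as x coordinates.
by_index = df2.interpolate(method="index")

print(equal_spacing)
print(by_index)
```

With a string index such as the 'a' to 'g' labels above, only the equally spaced behaviour is meaningful, since the labels carry no distances.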

For anything more complex, you need to roll your own function that takes a Series object, fills the NaN values as you like, and returns another Series object.
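As a rough sketch of what such a function could look like, the example below fills gaps by interpolating linearly in log space and applies it column by column with DataFrame.apply. The function name fill_log_linear and the log-linear rule are illustrative assumptions, not something from the answer; the sketch also assumes a numeric, ascending index and strictly positive values.

```python
import numpy as np
import pandas as pd

def fill_log_linear(col: pd.Series) -> pd.Series:
    """Fill NaNs in one column by linear interpolation of log(values).

    Hypothetical custom rule: assumes a numeric, ascending index and
    strictly positive known values.
    """
    known = col.notna().to_numpy()
    x = col.index.to_numpy(dtype=float)
    y = col.to_numpy(dtype=float)
    # np.interp interpolates linearly between the known points and
    # holds the end values flat outside their range.
    logs = np.interp(x, x[known], np.log(y[known]))
    # Keep the original values where known, use the interpolated ones elsewhere.
    filled = np.where(known, y, np.exp(logs))
    return pd.Series(filled, index=col.index, name=col.name)

df = pd.DataFrame({"a": [1.0, np.nan, 4.0],
                   "b": [2.0, np.nan, 8.0]}, index=[0, 1, 2])

# apply calls the function once per column and reassembles a DataFrame.
result = df.apply(fill_log_linear)
print(result)
```

Passing axis=1 to apply would run the same function along each row instead, which is one way to interpolate in the second dimension.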



Answer 2:

This is an old thread, but I thought I would share my solution for 2D extrapolation/interpolation that respects index values and works on demand. The code ended up a bit weird, so let me know if there is a better solution:

    import numpy
    import pandas
    from numpy import nan

    dataGrid = pandas.DataFrame({1: {1: 1, 3: 2},
                                 2: {1: 3, 3: 4}})


    def getExtrapolatedInterpolatedValue(x, y):
        global dataGrid
        # Add an empty row labelled x if it is missing, then fill it in.
        if x not in dataGrid.index:
            dataGrid.loc[x] = nan
            dataGrid = dataGrid.sort_index()
            # Interpolate using the index values as coordinates;
            # ffill/bfill hold the edge values flat outside the grid.
            dataGrid = dataGrid.interpolate(method='index', axis=0).ffill(axis=0).bfill(axis=0)

        # Do the same for an empty column labelled y.
        if y not in dataGrid.columns.values:
            dataGrid = dataGrid.reindex(columns=numpy.append(dataGrid.columns.values, y))
            dataGrid = dataGrid.sort_index(axis=1)
            dataGrid = dataGrid.interpolate(method='index', axis=1).ffill(axis=1).bfill(axis=1)

        return dataGrid[y][x]


    print(getExtrapolatedInterpolatedValue(2, 1.4))
    # approximately 2.3
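If mutating a global grid on every lookup is not wanted, one possible alternative is to interpolate along each axis with numpy.interp and leave the DataFrame untouched. This is only a sketch: the helper name interp2d_flat is hypothetical, and it assumes numeric, ascending index and column labels and no NaNs in the grid. Like the ffill/bfill calls above, numpy.interp holds the edge value flat outside the grid rather than extrapolating a slope.

```python
import numpy as np
import pandas as pd

def interp2d_flat(grid: pd.DataFrame, x: float, y: float) -> float:
    """Bilinear lookup at row coordinate x and column coordinate y.

    Hypothetical helper: assumes numeric, ascending index and columns
    and a grid without NaNs. The grid itself is not modified.
    """
    rows = grid.index.to_numpy(dtype=float)
    cols = grid.columns.to_numpy(dtype=float)
    values = grid.to_numpy(dtype=float)
    # First interpolate down each column to get the values at row x ...
    at_x = np.array([np.interp(x, rows, values[:, j]) for j in range(len(cols))])
    # ... then interpolate across the columns to reach y.
    return float(np.interp(y, cols, at_x))

grid = pd.DataFrame({1: {1: 1, 3: 2},
                     2: {1: 3, 3: 4}})

value = interp2d_flat(grid, 2, 1.4)
print(value)
```

The trade-off is that nothing is cached between calls, so repeated lookups redo the interpolation, whereas the answer's version stores every interpolated row and column in the grid.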

