How do you interpolate from an array containing datetime objects?

既然无缘 2020-12-31 09:44

I'm looking for a function analogous to np.interp that can work with datetime objects.

For example:

import datetime, numpy         


        
4 answers
  • 2020-12-31 10:08

    If you need sub-second precision in your timestamps, here's a slightly edited version of rchang's answer (basically just a different toTimestamp function):

    import datetime, numpy as np
    
    def toTimestamp(d):
      return d.timestamp()
    
    arr1 = np.array([toTimestamp(datetime.datetime(2000,1,2,3,4,5) + datetime.timedelta(0,d)) for d in np.linspace(0,1,9)]) 
    arr2 = np.arange(1,10) # 1, 2, ..., 9
    
    result = np.interp(toTimestamp(datetime.datetime(2000,1,2,3,4,5,678901)),arr1,arr2)
    print(result) # Prints 6.431207656860352
    

    I can't say anything about timezone issues, as I haven't tested this with other timezones.
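One way to sidestep the timezone question is to attach an explicit UTC offset to every datetime, since `timestamp()` on a timezone-aware object does not consult the machine's local timezone. The sketch below assumes your data really is in UTC; the names `to_timestamp`, `start`, `xp`, `fp` and `query` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

import numpy as np


def to_timestamp(d):
    # For an aware datetime, timestamp() is independent of the local timezone.
    return d.timestamp()


# Assumption: all datetimes are UTC. Nine sample points spread over one second.
start = datetime(2000, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
xp = np.array([to_timestamp(start + timedelta(seconds=float(s)))
               for s in np.linspace(0, 1, 9)])
fp = np.arange(1, 10)  # 1, 2, ..., 9

# Query half a second after the start, i.e. exactly at the fifth sample point.
query = datetime(2000, 1, 2, 3, 4, 5, 500000, tzinfo=timezone.utc)
result = np.interp(to_timestamp(query), xp, fp)
print(result)
```

With naive datetimes, `timestamp()` interprets the value as local time, so the same code could shift around DST transitions.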

  • 2020-12-31 10:15

    You can convert them to timestamps (edited to reflect the use of calendar.timegm to avoid timezone-related pitfalls).

    # Works on Python 2.7 and Python 3
    import calendar
    import datetime
    import numpy as np

    def toTimestamp(d):
      # timegm treats the time tuple as UTC, so no local-timezone shift is applied
      return calendar.timegm(d.timetuple())

    arr1 = np.array([toTimestamp(datetime.datetime(2008,1,d)) for d in range(1,10)])
    arr2 = np.arange(1,10)

    result = np.interp(toTimestamp(datetime.datetime(2008,1,5,12)),arr1,arr2)
    print(result) # Prints 5.5
    
  • 2020-12-31 10:19

    numpy.interp() expects arr1 and arr2 to be 1D sequences of floats, i.e., you should convert the sequence of datetime objects to a 1D sequence of floats if you want to use np.interp().

    If the input data uses the same UTC offset for all datetime objects, then you can get a float by subtracting a reference date from every value. This holds if your input is UTC (the offset is always zero):

    from datetime import datetime
    import numpy as np
    
    arr1 = np.array([datetime(2008, 1, d) for d in range(1, 10)])
    arr2 = np.arange(1, 10)
    
    def to_float(d, epoch=arr1[0]):
        return (d - epoch).total_seconds()
    
    # A list is used instead of map(), which returns an iterator on Python 3
    f = np.interp(to_float(datetime(2008,1,5,12)), [to_float(d) for d in arr1], arr2)
    print(f) # -> 5.5
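If the dates are stored as NumPy `datetime64` values rather than `datetime` objects (an assumption, not what the question shows), the same idea can be sketched without a Python-level loop by casting to integer seconds. The helper name `to_seconds` is hypothetical.

```python
import numpy as np

# Assumption: dates are held as datetime64, which carries no timezone.
xp_dates = np.arange("2008-01-01", "2008-01-10", dtype="datetime64[D]")
fp = np.arange(1, 10)


def to_seconds(dates):
    # Whole seconds since the Unix epoch; np.interp converts them to floats.
    return dates.astype("datetime64[s]").astype(np.int64)


query = np.datetime64("2008-01-05T12:00")
result = np.interp(to_seconds(query), to_seconds(xp_dates), fp)
print(result)
```

Casting to `datetime64[s]` discards anything finer than a second; a finer unit such as `datetime64[us]` could be used instead if sub-second precision matters.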
    
  • 2020-12-31 10:29

    I'm providing this as a complement to @rchang's answer for those wanting to do this all in Pandas. This function takes a pandas series containing dates and returns a new series with the values converted to 'number of days' after a specified date.

    import pandas as pd

    def convert_dates_to_days(dates, start_date=None, name='Day'):
        """Converts a series of dates to a series of float values that
        represent days since start_date.
        """

        # Without a start date, days are counted from the Unix epoch
        if start_date is not None:
            ts0 = pd.Timestamp(start_date).timestamp()
        else:
            ts0 = 0

        return ((dates.apply(pd.Timestamp.timestamp) -
                ts0)/(24*3600)).rename(name)
    

    I'm not sure whether it works with times of day or whether it is immune to the time-zone pitfalls mentioned above. But I think that as long as you provide a start date in the same time zone, which is subtracted from all the timestamp values, you should be okay.

    Here's how I used it:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from scipy.interpolate import interp1d
    
    data = pd.DataFrame({
        'Date': pd.date_range('2018-01-01', '2018-01-22', freq='7D'),
        'Value': np.random.randn(4)
    })
    
    x = convert_dates_to_days(data.Date, start_date='2018-01-01')
    y = data.Value
    f2 = interp1d(x, y, kind='cubic')
    
    all_dates = pd.Series(pd.date_range('2018-01-01', '2018-01-22'))
    x_all = convert_dates_to_days(all_dates, start_date='2018-01-01')
    
    plt.plot(all_dates, f2(x_all), '-')
    data.set_index('Date')['Value'].plot(style='o')
    plt.grid()
    plt.savefig("interp_demo.png")
    plt.show()
    

    It seems to work...
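For plain linear interpolation, pandas can also work on a date index directly, with no conversion to day counts. This is a different technique from the cubic `interp1d` fit above: a minimal sketch using `Series.interpolate(method="time")`, with made-up weekly values.

```python
import pandas as pd

# Hypothetical weekly observations indexed by date.
s = pd.Series(
    [1.0, 8.0, 15.0, 22.0],
    index=pd.date_range("2018-01-01", "2018-01-22", freq="7D"),
)

# Reindex to daily dates (new rows become NaN), then fill them linearly,
# weighting by the actual time gaps in the index.
daily_index = pd.date_range("2018-01-01", "2018-01-22", freq="D")
daily = s.reindex(daily_index).interpolate(method="time")

print(daily.loc[pd.Timestamp("2018-01-05")])
```

`method="time"` requires a `DatetimeIndex`, so the dates have to be the index rather than a column.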
