Question
I am using Pandas 0.18 and read_sas to load a sas7bdat dataset.
The dates in the Pandas dataframe appear as:
Out[56]:
0 19411.0
1 19325.0
2 19325.0
3 19443.0
4 19778.0
Name: sas_date, dtype: float64
pd.to_datetime does not recognize this format. What should I do to parse the dates correctly?
Thanks!
Answer 1:
According to this link,
[A] SAS date value is a value that represents the number of days between January 1, 1960, and a specified date
Therefore, if we convert the numbers to Pandas Timedeltas and add them to
1960-1-1 we can recover the date:
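import pandas as pd
# SAS date values from the question: days elapsed since 1960-01-01
ser = pd.Series([19411.0, 19325.0, 19325.0, 19443.0, 19778.0])
# Turn the day counts into Timedeltas and offset them from the SAS epoch
ser = pd.to_timedelta(ser, unit='D') + pd.Timestamp('1960-1-1')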
yields
0 2013-02-22
1 2012-11-28
2 2012-11-28
3 2013-03-26
4 2014-02-24
dtype: datetime64[ns]
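On pandas 0.20 or later (newer than the 0.18 mentioned in the question), the same conversion can likely be done in a single call, because pd.to_datetime accepts an origin argument that sets the epoch the numbers are counted from. The sketch below is only an illustration: the sample values are taken from the question, and the NaN entry is an assumption added to show that missing values should come out as NaT.

```python
import numpy as np
import pandas as pd

# Sample values from the question; the NaN is a hypothetical missing date.
sas_date = pd.Series([19411.0, 19325.0, np.nan, 19778.0], name="sas_date")

# unit="D" treats each number as a count of days;
# origin="1960-01-01" is the SAS epoch (needs pandas >= 0.20).
converted = pd.to_datetime(sas_date, unit="D", origin="1960-01-01")
print(converted)
```

For a DataFrame loaded with read_sas, the same call would be applied to the date column and the result assigned back to it.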
Source: https://stackoverflow.com/questions/36412864/convert-numeric-sas-date-to-datetime-in-pandas