Python pandas integer YYYYMMDD to datetime

后端 未结 1 1389
清酒与你
清酒与你 2020-12-13 13:23

Apologies in advance for this, but after two hours of searching and trying I cannot get the right answer here. I have a data frame, populated via pandas io sql.read_frame().

相关标签:
1条回答
  • 2020-12-13 13:46

    to_datetime accepts a format string:

    In [92]:
    
    t = 20070530
    pd.to_datetime(str(t), format='%Y%m%d')
    Out[92]:
    Timestamp('2007-05-30 00:00:00')
    

    example:

    In [94]:
    
    t = 20070530
    df = pd.DataFrame({'date':[t]*10})
    df
    Out[94]:
           date
    0  20070530
    1  20070530
    2  20070530
    3  20070530
    4  20070530
    5  20070530
    6  20070530
    7  20070530
    8  20070530
    9  20070530
    In [98]:
    
    df['DateTime'] = df['date'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d'))
    df
    Out[98]:
           date   DateTime
    0  20070530 2007-05-30
    1  20070530 2007-05-30
    2  20070530 2007-05-30
    3  20070530 2007-05-30
    4  20070530 2007-05-30
    5  20070530 2007-05-30
    6  20070530 2007-05-30
    7  20070530 2007-05-30
    8  20070530 2007-05-30
    9  20070530 2007-05-30
    In [99]:
    
    df.dtypes
    Out[99]:
    date                 int64
    DateTime    datetime64[ns]
    dtype: object
    

    EDIT

    Actually it's quicker to convert the type to string and then convert the entire series to a datetime rather than calling apply on every value:

    In [102]:
    
    df['DateTime'] = pd.to_datetime(df['date'].astype(str), format='%Y%m%d')
    df
    Out[102]:
           date   DateTime
    0  20070530 2007-05-30
    1  20070530 2007-05-30
    2  20070530 2007-05-30
    3  20070530 2007-05-30
    4  20070530 2007-05-30
    5  20070530 2007-05-30
    6  20070530 2007-05-30
    7  20070530 2007-05-30
    8  20070530 2007-05-30
    9  20070530 2007-05-30
    

    timings

    In [104]:
    
    %timeit df['date'].apply(lambda x: pd.to_datetime(str(x), format='%Y%m%d'))
    
    100 loops, best of 3: 2.55 ms per loop
    In [105]:
    
    %timeit pd.to_datetime(df['date'].astype(str), format='%Y%m%d')
    1000 loops, best of 3: 396 µs per loop
    
    0 讨论(0)
提交回复
热议问题