Convert a column of datetimes to epoch in Python

Posted by 孤人 on 2019-12-17 19:15:53

Question


I'm having an issue with Python. I have a Pandas DataFrame, and one of the columns holds dates as strings in this format:

"%Y-%m-%d %H:%M:00.000". For example: "2011-04-24 01:30:00.000"

I need to convert the entire column to integers. I tried to run this code, but it is extremely slow and I have a few million rows.

    for i in range(calls.shape[0]):
        calls['dateint'][i] = int(time.mktime(time.strptime(calls.DATE[i], "%Y-%m-%d %H:%M:00.000")))

Do you guys know how to convert the whole column to epoch time?

Thanks in advance!


Answer 1:


Convert the string to a datetime using to_datetime, then subtract the datetime 1970-1-1 and call dt.total_seconds():

In [2]:
import pandas as pd
import datetime as dt
df = pd.DataFrame({'date':['2011-04-24 01:30:00.000']})
df

Out[2]:
                      date
0  2011-04-24 01:30:00.000

In [3]:
df['date'] = pd.to_datetime(df['date'])
df

Out[3]:
                 date
0 2011-04-24 01:30:00

In [6]:    
(df['date'] - dt.datetime(1970,1,1)).dt.total_seconds()

Out[6]:
0    1303608600
Name: date, dtype: float64

You can see that converting this value back yields the same time:

In [8]:
pd.to_datetime(1303608600, unit='s')

Out[8]:
Timestamp('2011-04-24 01:30:00')

So you can either add a new column or overwrite:

In [9]:
df['epoch'] = (df['date'] - dt.datetime(1970,1,1)).dt.total_seconds()
df

Out[9]:
                 date       epoch
0 2011-04-24 01:30:00  1303608600

EDIT

A better method, as suggested by @Jeff:

In [3]:
df['date'].astype('int64')//1e9

Out[3]:
0    1303608600
Name: date, dtype: float64

In [4]:
%timeit (df['date'] - dt.datetime(1970,1,1)).dt.total_seconds()
%timeit df['date'].astype('int64')//1e9

100 loops, best of 3: 1.72 ms per loop
1000 loops, best of 3: 275 µs per loop

The timings show that it is also significantly faster: 275 µs versus 1.72 ms, roughly six times quicker.
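Since the question asks for integers, note that floor dividing by the float 1e9 yields a float64 column. A minimal sketch of an integer variant follows; it assumes the values should be expressed in nanoseconds before dividing, so the column is cast to datetime64[ns] first (an added precaution, not part of the original answer, in case the installed pandas version stores datetimes in a different unit).

```python
import pandas as pd

# Same sample value as in the answer above
df = pd.DataFrame({"date": pd.to_datetime(["2011-04-24 01:30:00.000"])})

# Force nanosecond resolution so the int64 view is nanoseconds since the epoch
ns = df["date"].astype("datetime64[ns]").astype("int64")

# Integer floor division keeps the result as int64 instead of float64
df["epoch"] = ns // 10**9
```

Dividing by the integer 10**9 rather than 1e9 is the only change in the arithmetic; the result can be used directly as an integer key or written out without a trailing ".0".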




Answer 2:


From the Pandas documentation on working with time series data:

We subtract the epoch (midnight at January 1, 1970 UTC) and then floor divide by the “unit” (1 ms).

import pandas as pd

# generate some timestamps
stamps = pd.date_range('2012-10-08 18:15:05', periods=4, freq='D')

# convert it to milliseconds from epoch
(stamps - pd.Timestamp("1970-01-01")) // pd.Timedelta('1ms')

This will give the epoch time in milliseconds.
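The same idea can be applied to the DataFrame column from the question to get whole seconds. This is only a sketch: the sample rows are made up for illustration, the column names DATE and dateint are taken from the question, and the strings are assumed to represent UTC times. The question's time.mktime call interprets them as local time instead, so its results may differ from these by the local UTC offset.

```python
import pandas as pd

# Hypothetical sample column in the question's string format
calls = pd.DataFrame({"DATE": ["2011-04-24 01:30:00.000",
                               "2011-04-24 01:31:00.000"]})

# Parse the whole column at once, with an explicit format
parsed = pd.to_datetime(calls["DATE"], format="%Y-%m-%d %H:%M:%S.%f")

# Subtract the epoch, then floor divide by a one-second unit
calls["dateint"] = (parsed - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")
```

Floor dividing by a Timedelta returns an integer column, and because the unit is spelled out, the expression does not depend on how the datetimes are stored internally.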



Source: https://stackoverflow.com/questions/35630098/convert-a-column-of-datetimes-to-epoch-in-python
