Timezone conversion of a large list of timestamps from an Excel file with Python

Posted by 試著忘記壹切 on 2019-12-13 08:29:21

Question


I have an Excel file named "hello.xlsx". It has a column of timestamps with a lot of rows (more than 80,000 for now). The file basically looks like this:

03/29/2018 19:24:50

03/29/2018 19:24:59

03/29/2018 19:24:59

03/29/2018 19:25:02

03/29/2018 19:25:06

03/29/2018 19:25:10

03/29/2018 19:25:20

03/29/2018 19:25:27

03/29/2018 19:25:27

03/29/2018 19:25:36

03/29/2018 19:25:49

And so on...

These timestamps are in UTC, and I need to convert them to US Pacific Time (UTC-7).

I searched online and tried some formulas within Excel but could not get them to work. Then I wrote the piece of code shown below:

df = pd.read_excel('hello1.xlsx', header=None)

df[0] = pd.to_datetime(df[0]).dt.astimezone(timezone('US/Pacific'))

df.to_excel('out.xlsx', index=False, header=False)

I tried running it, but it raised an error. I think I need to change or add something in the second line of the code. I'm very new to Python and I hope someone can help me figure it out. I would really appreciate that. :)


Answer 1:


If you want to go the Python way, one option is to use the apply method, marking the times as UTC before converting:

import pytz
df[0] = df[0].apply(lambda x: x.replace(tzinfo=pytz.utc).astimezone(pytz.timezone('US/Pacific')).replace(tzinfo=None))

The lambda does 3 things:

  1. Set the timezone of each time record to UTC.
  2. Convert to US/Pacific.
  3. Set it back to naive time. You need this step to export to Excel; otherwise, pandas will raise an error because Excel does not support timezone-aware datetimes.
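To make the three steps concrete, here is a minimal sketch on a single timestamp. It uses the standard-library `zoneinfo` module (Python 3.9+) instead of `pytz`, which is a substitution and not what the answer uses, and it assumes the zone name "America/Los_Angeles", which follows the same rules as 'US/Pacific'. The variable names are only for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

# Assumption: "America/Los_Angeles" stands in for 'US/Pacific' (same zone rules).
pacific = ZoneInfo("America/Los_Angeles")

naive_utc = datetime(2018, 3, 29, 19, 24, 50)        # value as read from the sheet

aware_utc = naive_utc.replace(tzinfo=timezone.utc)   # step 1: label the value as UTC
aware_pacific = aware_utc.astimezone(pacific)        # step 2: convert to Pacific time
naive_pacific = aware_pacific.replace(tzinfo=None)   # step 3: drop tzinfo for Excel

print(naive_pacific)

# A January timestamp falls outside daylight saving time, so the offset is 8 hours.
winter = datetime(2018, 1, 15, 19, 24, 50, tzinfo=timezone.utc).astimezone(pacific)
print(winter.replace(tzinfo=None))
```

Because the conversion goes through a named zone, timestamps outside daylight saving time are shifted by 8 hours rather than 7, which a fixed UTC-7 offset would not do.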

Your df will look like:

                     0
0  2018-03-29 12:24:50
1  2018-03-29 12:24:59
2  2018-03-29 12:24:59
3  2018-03-29 12:25:02
4  2018-03-29 12:25:06
5  2018-03-29 12:25:10
6  2018-03-29 12:25:20
7  2018-03-29 12:25:27
8  2018-03-29 12:25:27
9  2018-03-29 12:25:36
10 2018-03-29 12:25:49
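For a column with 80,000+ rows, a vectorized variant may be worth considering. The sketch below is an alternative to the apply-based answer, not part of it: it uses pandas' `.dt.tz_localize` and `.dt.tz_convert` accessors on the whole column. The small in-memory DataFrame is a hypothetical stand-in for the data that `pd.read_excel(..., header=None)` would return, and the explicit `format` string is an assumption based on the sample timestamps in the question.

```python
import pandas as pd

# Hypothetical stand-in for df = pd.read_excel('hello.xlsx', header=None)
df = pd.DataFrame({0: ["03/29/2018 19:24:50",
                       "03/29/2018 19:24:59",
                       "03/29/2018 19:25:02"]})

df[0] = (
    pd.to_datetime(df[0], format="%m/%d/%Y %H:%M:%S")  # parse text into naive datetimes
      .dt.tz_localize("UTC")          # mark the naive values as UTC
      .dt.tz_convert("US/Pacific")    # convert to Pacific time (DST-aware)
      .dt.tz_localize(None)           # drop the timezone so Excel export works
)

print(df)
# The result could then be written with df.to_excel('out.xlsx', index=False, header=False)
```

If the Excel column already holds real date cells, `pd.read_excel` typically returns datetimes directly and the `format` argument is unnecessary.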



Answer 2:


In Excel (and in many other data tools), date-time values are stored as decimal numbers: the integer part counts whole days and the fractional part is the fraction of a day. So you can simply subtract 7/24 (which is 7 hours in Excel's time format) to convert a value from UTC to UTC-7.

For instance, if your time value is in A1, try entering the formula below in A2:

=A1-(7/24)
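The arithmetic behind this formula can be checked outside Excel. The following is a rough sketch that assumes Excel's default 1900 date system, in which serial numbers for modern dates count days from 1899-12-30; the helper names `to_excel_serial` and `from_excel_serial` are made up for this illustration.

```python
from datetime import datetime, timedelta

# Assumption: Excel's 1900 date system; day 0 is 1899-12-30 for dates after February 1900.
EXCEL_EPOCH = datetime(1899, 12, 30)

def to_excel_serial(dt):
    """Return the serial number (days since the epoch, as a float) for a naive datetime."""
    return (dt - EXCEL_EPOCH) / timedelta(days=1)

def from_excel_serial(serial):
    """Convert a serial number back to a datetime, rounded to the nearest second."""
    return EXCEL_EPOCH + timedelta(seconds=round(serial * 86400))

utc_time = datetime(2018, 3, 29, 19, 24, 50)
serial = to_excel_serial(utc_time)             # integer part: days, fraction: time of day
shifted = from_excel_serial(serial - 7 / 24)   # same arithmetic as =A1-(7/24)

print(int(serial))
print(shifted)
```

Note that a fixed 7/24 shift matches Pacific time only while daylight saving time is in effect; during standard time the offset from UTC is 8 hours.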

Edit, regarding the format:

To display the formula cell as a date/time rather than as a decimal number, change its number format to a date/time format.



Source: https://stackoverflow.com/questions/50137360/timezone-conversion-of-a-large-list-of-timestamps-from-an-excel-file-with-python
