Spark: converting GMT time stamps to Eastern taking daylight savings into account

Submitted by 给你一囗甜甜゛ on 2020-05-14 18:13:52

Question


I'm trying to convert a column of GMT timestamp strings into a column of timestamps in Eastern timezone. I want to take daylight savings into account.

My column of timestamp strings looks like this:

'2017-02-01T10:15:21+00:00'

I figured out how to convert the string column into a timestamp in EST:

from pyspark.sql import functions as F

df2 = df1.withColumn('datetimeGMT', df1.myTimeColumnInGMT.cast('timestamp'))
df3 = df2.withColumn('datetimeEST', F.from_utc_timestamp(df2.datetimeGMT, "EST"))

But the times don't change with daylight saving time. Is there another function, or an argument I'm missing, that accounts for daylight saving when converting the timestamps?

EDIT: I think I figured it out. In the from_utc_timestamp call above, I needed to use "America/New_York" instead of "EST":

df3 = df2.withColumn('datetimeET', F.from_utc_timestamp(df2.datetimeGMT, "America/New_York"))

Answer 1:


I ended up figuring out the answer, so I'm adding it here. I also think this question/answer is worthwhile because, while searching before posting the question, I couldn't find anything about daylight saving time for Spark. I probably should have realized I needed to search for the underlying Java time-zone handling instead.

The answer ended up being to use the region-based ID "America/New_York" instead of the abbreviation "EST". This correctly applies daylight saving time.

from pyspark.sql import functions as F
df3 = df2.withColumn('datetimeET', F.from_utc_timestamp(df2.datetimeGMT, "America/New_York"))
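To see why the region-based ID matters, here is a plain-Python sketch (standard-library `zoneinfo`, not Spark) comparing a winter and a summer UTC timestamp. `"America/New_York"` shifts by -5 hours in winter (EST) but -4 hours in summer (EDT), which is exactly the behavior the fixed `"EST"` offset cannot give you:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

ny = ZoneInfo("America/New_York")

# Same wall-clock time in UTC, one in winter and one in summer
winter = datetime(2017, 2, 1, 10, 15, 21, tzinfo=timezone.utc)
summer = datetime(2017, 7, 1, 10, 15, 21, tzinfo=timezone.utc)

print(winter.astimezone(ny))  # 2017-02-01 05:15:21-05:00 (EST)
print(summer.astimezone(ny))  # 2017-07-01 06:15:21-04:00 (EDT)
```

Spark's `from_utc_timestamp` consults the same IANA time-zone database under the hood, so passing `"America/New_York"` gets you this DST-aware shift.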

EDIT:

This link shows a list of available time zone strings that can be used in this way: https://garygregory.wordpress.com/2013/06/18/what-are-the-java-timezone-ids/
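If you'd rather enumerate the valid IDs programmatically, Python's standard-library `zoneinfo` module (3.9+) exposes essentially the same set of IANA region IDs that the JVM accepts (assuming a system tzdata database is available):

```python
from zoneinfo import available_timezones

ids = available_timezones()

# Region-based IDs like "America/New_York" follow DST rules;
# bare abbreviations like "EST" are fixed offsets and never shift.
print("America/New_York" in ids)  # True
print(len(ids) > 400)             # True; the IANA database has several hundred zones
```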



Source: https://stackoverflow.com/questions/45763587/spark-converting-gmt-time-stamps-to-eastern-taking-daylight-savings-into-accoun
