Python - pandas datetime column with multiple timezones

爱⌒轻易说出口 submitted on 2020-07-30 09:27:47

Question


I have a data frame with multiple users and time zones, like this:

import pandas as pd

cols = ['user', 'zone_name', 'utc_datetime']
data = [
    [1, 'Europe/Amsterdam', pd.to_datetime('2019-11-13 11:14:15')],
    [2, 'Europe/London', pd.to_datetime('2019-11-13 11:14:15')],
]

df = pd.DataFrame(data, columns=cols)

Based on another post, I apply the following transformation to get each user's local datetime:

df['local_datetime'] = df.groupby('zone_name')[
    'utc_datetime'
].transform(lambda x: x.dt.tz_localize(x.name))

Which outputs this:

    user    zone_name     utc_datetime          local_datetime
    1   Europe/Amsterdam  2019-11-13 11:14:15   2019-11-13 11:14:15+01:00
    2   Europe/London     2019-11-13 11:14:15   2019-11-13 11:14:15+00:00

However, the local_datetime column has dtype object, and I cannot find a way to get it as datetime64[ns] in the following format (desired output):

    user    zone_name     utc_datetime          local_datetime
    1   Europe/Amsterdam  2019-11-13 11:14:15   2019-11-13 12:14:15
    2   Europe/London     2019-11-13 11:14:15   2019-11-13 11:14:15

Answer 1:


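I think you need Series.dt.tz_convert in the lambda function. Localize the naive values to UTC first, convert them to each group's time zone, then strip the UTC offset from the string form and parse the result back to datetime: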

df['local_datetime'] = (pd.to_datetime(df.groupby('zone_name')['utc_datetime']
    .transform(lambda x: x.dt.tz_localize('UTC').dt.tz_convert(x.name))
    .astype(str).str[:-6]))

print(df)
   user         zone_name        utc_datetime      local_datetime
0     1  Europe/Amsterdam 2019-11-13 11:14:15 2019-11-13 12:14:15
1     2     Europe/London 2019-11-13 11:14:15 2019-11-13 11:14:15
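The difference between tz_localize and tz_convert is what separates the question's output from the desired one. The following is a minimal sketch on a single Timestamp, assuming the stored value is a naive UTC time as in the question; the variable names naive, labelled and converted are illustrative only.

```python
import pandas as pd

# A naive timestamp that is known to represent UTC, as in the question.
naive = pd.Timestamp('2019-11-13 11:14:15')

# tz_localize only attaches a zone: the wall-clock time stays 11:14:15.
labelled = naive.tz_localize('Europe/Amsterdam')

# Localizing as UTC and then converting shifts the wall-clock time.
converted = naive.tz_localize('UTC').tz_convert('Europe/Amsterdam')

print(labelled)   # 2019-11-13 11:14:15+01:00
print(converted)  # 2019-11-13 12:14:15+01:00
```

The first result matches what the question's code produced; the second is the local time the question asks for, still carrying its offset.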



Answer 2:


A somewhat shorter answer using DataFrame.apply:

df['local_datetime'] = df.apply(
    lambda x: x.utc_datetime.tz_localize(tz="UTC").tz_convert(x.zone_name),
    axis=1,
)
print(df)
   user         zone_name        utc_datetime             local_datetime
0     1  Europe/Amsterdam 2019-11-13 11:14:15  2019-11-13 12:14:15+01:00
1     2     Europe/London 2019-11-13 11:14:15  2019-11-13 11:14:15+00:00
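The column printed above holds timestamps in two different time zones, which is likely why it ends up with dtype object: as far as I know, a tz-aware datetime column in pandas carries exactly one time zone. A small sketch of that behaviour, using hypothetical Series named mixed and same:

```python
import pandas as pd

ams = pd.Timestamp('2019-11-13 12:14:15', tz='Europe/Amsterdam')
lon = pd.Timestamp('2019-11-13 11:14:15', tz='Europe/London')

# Two different zones in one Series: pandas falls back to object dtype.
mixed = pd.Series([ams, lon])

# A single zone in the Series: pandas keeps a tz-aware datetime dtype.
same = pd.Series([ams, ams])

print(mixed.dtype)
print(same.dtype)
```

With an object column the .dt accessor is not available, so the choice is usually between one shared zone for the whole column or naive local times.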

If you want to remove the time zone information, call tz_localize(None) on the converted value. This drops the zone and keeps the local wall-clock time:

df['local_datetime'] = df.apply(
    lambda x: x.utc_datetime.tz_localize(tz="UTC")
    .tz_convert(x.zone_name)
    .tz_localize(None),
    axis=1,
)
print(df)
   user         zone_name        utc_datetime      local_datetime
0     1  Europe/Amsterdam 2019-11-13 11:14:15 2019-11-13 12:14:15
1     2     Europe/London 2019-11-13 11:14:15 2019-11-13 11:14:15
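Row-wise apply can be slow on large frames. One possible alternative, sketched here as an assumption and not taken from either answer, is to convert each zone's rows in one vectorized step and stitch the naive results back together with pd.concat, which keeps the original index so the assignment aligns row by row. The name parts is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'user': [1, 2],
    'zone_name': ['Europe/Amsterdam', 'Europe/London'],
    'utc_datetime': pd.to_datetime(['2019-11-13 11:14:15'] * 2),
})

# Convert one zone at a time; each piece ends up timezone-naive,
# so the pieces share a plain datetime dtype when concatenated.
parts = [
    grp.dt.tz_localize('UTC').dt.tz_convert(zone).dt.tz_localize(None)
    for zone, grp in df.groupby('zone_name')['utc_datetime']
]

# Assignment aligns on the index, so the group order does not matter.
df['local_datetime'] = pd.concat(parts)

print(df)
```

This loops once per distinct zone instead of once per row, which should help when there are many rows and few zones.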


Source: https://stackoverflow.com/questions/59967993/python-pandas-datetime-column-with-multiple-timezones
