Message “Exception ignored” when dealing with the pandas datetime type

烈酒焚心 submitted on 2020-06-15 06:27:46

Question


I have an xlsx file with a column containing dates in the format "01.01.1900 09:01:25". The file is password protected, so I load it into a dataframe by means of the win32com.client library.

Here is the code:

import pandas as pd
import win32com.client

File = r"C:\path\to\workbook.xlsx"  # placeholder path; not given in the original post

xlApp = win32com.client.Dispatch("Excel.Application")
xlApp.DisplayAlerts = False
# Arguments: FileName, UpdateLinks, ReadOnly, Format, Password (here " ")
xlwb = xlApp.Workbooks.Open(File, False, True, None, " ")
xlws = xlwb.Sheets("Sheet 1")  # Open "Sheet 1"

# Get table dimensions
LastRow = xlws.Range("A1").CurrentRegion.Rows.Count
LastColumn = xlws.Range("A1").CurrentRegion.Columns.Count
# Range.Value returns a tuple of row tuples; the first row holds the header
header = list(xlws.Range(xlws.Cells(1, 1), xlws.Cells(1, LastColumn)).Value[0])
content = list(xlws.Range(xlws.Cells(2, 1), xlws.Cells(LastRow, LastColumn)).Value)

# Build the dataframe
df = pd.DataFrame(data=content, columns=header)
print(df)

I checked that, once imported, the dtype has been automatically and correctly assigned to datetime64 for that column. The issue is that any time I try to do anything with a value of that column (just print it or compare it), I get a message saying:

  File "pandas\_libs\tslibs\timezones.pyx", line 227, in pandas._libs.tslibs.timezones.get_dst_info

AttributeError: 'NoneType' object has no attribute 'total_seconds'

Exception ignored in: 'pandas._libs.tslib._localize_tso'
Traceback (most recent call last):
  File "pandas\_libs\tslibs\timezones.pyx", line 227, in pandas._libs.tslibs.timezones.get_dst_info
AttributeError: 'NoneType' object has no attribute 'total_seconds'
Traceback (most recent call last):

Nonetheless, the code works perfectly, but the warning message is annoying.

Is there anything I can do with the datatype to avoid that warning?


Answer 1:


When the Excel file is opened this way, the content variable is a list of tuples.

Looking at those tuples, each date carries a TimeZoneInfo that localizes it to a time zone, in my case "GMT Standard Time".

So once converted to a dataframe, df.dtypes reports not just "datetime64" but something like "datetime64 (UTC+0:00) Dublin, Edinburgh, ..."
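
To check which columns of the raw tuples carry a time zone before building the dataframe, something like the following minimal sketch could be used. The helper name tz_aware_columns and the sample data are hypothetical, and a fixed-offset datetime.timezone stands in for the TimeZoneInfo object that win32com attaches, so this only illustrates the inspection step and does not reproduce the warning.

```python
from datetime import datetime, timedelta, timezone


def tz_aware_columns(rows):
    """Return the sorted indexes of columns holding a datetime with tzinfo set."""
    found = set()
    for row in rows:
        for idx, value in enumerate(row):
            if isinstance(value, datetime) and value.tzinfo is not None:
                found.add(idx)
    return sorted(found)


# Hypothetical stand-in for the `content` list of tuples read from Excel.
sample_tz = timezone(timedelta(0), "GMT Standard Time")
sample_content = [
    (1.0, datetime(1900, 1, 1, 9, 1, 25, tzinfo=sample_tz)),
    (2.0, datetime(1900, 1, 2, 10, 30, 0, tzinfo=sample_tz)),
]

print(tz_aware_columns(sample_content))  # [1]
```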

This time zone is only attached when the Excel file is opened through win32com.client. If you remove the password, you can open the file with pandas.read_excel and see that no time zone is set for those datetimes and the warning does not appear.

I don't know exactly why this happens, but I have a solution for the original example. The warning disappears when you set a time zone recognized by the tz database, such as "UTC", or simply None. Something like:

df["col_name"] = df["col_name"].dt.tz_convert(None)
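
Note that tz_convert(None) converts the values to UTC before dropping the time zone, so the wall-clock time can shift if the attached zone is not at UTC+0. A possible alternative, sketched below under assumptions, is to strip the tzinfo from the raw tuples before the dataframe is built, which keeps the wall-clock time as read from Excel. The helper name strip_tzinfo and the sample data are hypothetical, and the sketch assumes the values returned by win32com behave like standard datetime.datetime objects (so that replace(tzinfo=None) is available); it has not been verified against a real workbook.

```python
from datetime import datetime, timedelta, timezone


def strip_tzinfo(rows):
    """Return a copy of rows with tzinfo removed from every datetime value."""
    cleaned = []
    for row in rows:
        cleaned.append(tuple(
            # replace(tzinfo=None) keeps the wall-clock fields untouched
            value.replace(tzinfo=None) if isinstance(value, datetime) else value
            for value in row
        ))
    return cleaned


# Hypothetical stand-in for `content`, using a fixed UTC+1 offset.
sample_rows = [
    ("a", datetime(2020, 6, 15, 9, 1, 25, tzinfo=timezone(timedelta(hours=1)))),
    ("b", None),
]

naive_rows = strip_tzinfo(sample_rows)
print(naive_rows[0][1])  # 2020-06-15 09:01:25
```

With the original example, this would be applied as pd.DataFrame(data=strip_tzinfo(content), columns=header), so the column is created as a plain datetime64 without any time zone.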


Source: https://stackoverflow.com/questions/51827582/message-exception-ignored-when-dealing-pandas-datetime-type
