问题
I have a xlsx file with a column containing Dates in the format: "01.01.1900 09:01:25". The file is password protected so I convert it to a dataframe by means of win32com.client library.
Here is the code:
import pandas as pd
import win32com.client
xlApp = win32com.client.Dispatch("Excel.Application")
xlApp.DisplayAlerts = False
xlwb = xlApp.Workbooks.Open(File, False, True, None, " ") #Open Workbook password " "
xlws = xlwb.Sheets("Sheet 1") #Open Sheet 1
#Get table dimensions
LastRow = xlws.Range("A1").CurrentRegion.Rows.Count
LastColumn = xlws.Range("A1").CurrentRegion.Columns.Count
header=list((xlws.Range(xlws.Cells(1, 1), xlws.Cells(1, LastColumn)).Value)[0])
content = list(xlws.Range(xlws.Cells(2, 1), xlws.Cells(LastRow, LastColumn)).Value)
#Get the dataframe
df=pd.DataFrame(data=content, columns=header)
print (df)
I checked that once imported dtype as been automatically and correctly assigned to datetime64 for that column. The issue is that any time I try to do whatever with any value of that column (just print it or compare it) I get a meesage saying:
File "pandas\_libs\tslibs\timezones.pyx", line 227, in pandas._libs.tslibs.timezones.get_dst_info
AttributeError: 'NoneType' object has no attribute 'total_seconds'
Exception ignored in: 'pandas._libs.tslib._localize_tso'
Traceback (most recent call last):
File "pandas\_libs\tslibs\timezones.pyx", line 227, in pandas._libs.tslibs.timezones.get_dst_info
AttributeError: 'NoneType' object has no attribute 'total_seconds'
Traceback (most recent call last):
Nonetheless the code works perfectly, but the warning message is annoying me.
Is there anything I can do with the datatype to avoid that warning?
回答1:
Opening the excel in this way, the content
variable is a list of tuples.
Having a look on those tuples there is a TimeZoneInfo that localizes all the dates in a kind of time zone, in my case "GMT Standard Time".
So once converted to a dataframe, when doing df.dtypes
the result is not only "datetime64" but "datetime64 (UTC+0:00) Dublin, Edimburg, ..."
This time zone setting only happens when opening the excel file through win32com.client
. If you removed the password, you can open it with pandas.read_excel
and discover that there is no timezone set for those datetimes and the mentioned warning does not appear.
Don't know exactly the reason it happens, but I have a solution for the original example. The warning dissapears setting a timezone recognized by tz database as "UTC"
or simply None
. Something like:
df["col_name"]=df["col_name"].dt.tz_convert(None)
来源:https://stackoverflow.com/questions/51827582/message-exception-ignored-when-dealing-pandas-datetime-type