Pandas - “time data does not match format ” error when the string does match the format?

风格不统一 提交于 2020-06-23 11:03:44

问题


I'm getting a value error saying my data does not match the format when it does. Not sure if this is a bug or I'm missing something here. I'm referring to this documentation for the string format. The weird part is if I write the 'data' Dataframe to a csv and read it in then call the function below it will convert the date so I'm not sure why it doesn't work without writing to a csv.

Any ideas?

data['Date'] = pd.to_datetime(data['Date'], format='%d-%b-%Y')  

I'm getting two errors

 TypeError: Unrecognized value type: <class 'str'>  

 ValueError: time data '27‑Aug‑2018' does not match format '%d-%b-%Y' (match) 

Example dates -

2‑Jul‑2018     
27‑Aug‑2018    
28‑May‑2018    
19‑Jun‑2017    
5‑Mar‑2018     
15‑Jan‑2018    
11‑Nov‑2013    
23‑Nov‑2015    
23‑Jun‑2014    
18‑Jun‑2018    
30‑Apr‑2018    
14‑May‑2018    
16‑Apr‑2018    
26‑Feb‑2018    
19‑Mar‑2018    
29‑Jun‑2015   

Is it because they all aren't double digit days? What is the string format value for single digit days? Looks like this could be the cause but I'm not sure why it would error on the '27' though.

End solution (It was unicode & not a string) -

data['Date'] = data['Date'].apply(unidecode.unidecode)  
data['Date'] = data['Date'].apply(lambda x: x.replace("-", "/"))    
data['Date'] = pd.to_datetime(data['Date'], format="%d/%b/%Y")

回答1:


There seems to be an issue with your date strings. I replicated your issue with your sample data and if I remove the hyphens and replace them manually (for the first three dates) then the code works

pd.to_datetime(df1['Date'] ,errors ='coerce')

output:

0    2018-07-02
1    2018-08-27
2    2018-05-28
3           NaT
4           NaT
5           NaT
6           NaT
7           NaT
8           NaT
9           NaT
10          NaT
11          NaT
12          NaT
13          NaT
14          NaT
15          NaT

Bottom line: your hyphens look like regular ones but are actually something else, just clean your source data and you're good to go




回答2:


You got a special mark here it is not -

df.iloc[0,0][2]
Out[287]: '‑'

Replace it with '-'

pd.to_datetime(df.iloc[:,0].str.replace('‑','-'),format='%d-%b-%Y')
Out[288]: 
0    2018-08-27
1    2018-05-28
2    2017-06-19
3    2018-03-05
4    2018-01-15
5    2013-11-11
6    2015-11-23
7    2014-06-23
8    2018-06-18
9    2018-04-30
10   2018-05-14
11   2018-04-16
12   2018-02-26
13   2018-03-19
14   2015-06-29
Name: 2‑Jul‑2018, dtype: datetime64[ns]


来源:https://stackoverflow.com/questions/53101655/pandas-time-data-does-not-match-format-error-when-the-string-does-match-the

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!