python - pandas - check if date exists in dataframe

Anonymous (unverified), submitted 2019-12-03 01:00:01

Question:

I have a dataframe like this:

       category        date  number
    0      Cat1  2010-03-01       1
    1      Cat2  2010-09-01       1
    2      Cat3  2010-10-01       1
    3      Cat4  2010-12-01       1
    4      Cat5  2012-04-01       1
    5      Cat2  2013-02-01       1
    6      Cat3  2013-07-01       1
    7      Cat4  2013-11-01       2
    8      Cat5  2014-11-01       5
    9      Cat2  2015-01-01       1
    10     Cat3  2015-03-01       1

I would like to check whether a date exists in this dataframe, but I am unable to. I tried several approaches, shown below, without success:

    if pandas.Timestamp("2010-03-01 00:00:00", tz=None) in df['date'].values:
        print 'date exist'

    if datetime.strptime('2010-03-01', '%Y-%m-%d') in df['date'].values:
        print 'date exist'

    if '2010-03-01' in df['date'].values:
        print 'date exist'

'date exist' never gets printed. How can I check whether the date exists? I want to insert the missing dates with number equal to 0 for all categories so that I can plot a continuous line chart (one line per category). Help is appreciated. Thanks in advance.

The last attempt gives me the following warning, and 'date exist' is still not printed:

    FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison

Answer 1:

I think you need to convert the column to datetime first with to_datetime. Then, if you need to select all matching rows, use boolean indexing:

    df.date = pd.to_datetime(df.date)

    print (df.date == pd.Timestamp("2010-03-01 00:00:00"))
    0      True
    1     False
    2     False
    3     False
    4     False
    5     False
    6     False
    7     False
    8     False
    9     False
    10    False
    Name: date, dtype: bool

    print (df[df.date == pd.Timestamp("2010-03-01 00:00:00")])
      category       date  number
    0     Cat1 2010-03-01       1

To get a single True instead, check for the value in the column converted to a NumPy array with values:

    if ('2010-03-01' in df['date'].values):
        print ('date exist')

Or test for at least one True with any, as EdChum suggested in a comment:

    if (df.date == pd.Timestamp("2010-03-01 00:00:00")).any():
        print ('date exist')
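Putting the pieces of this answer together, here is a minimal self-contained sketch. The sample frame copies only the first three rows of the question's data, and `date_exists` is a hypothetical helper name introduced for illustration, not something pandas provides.

```python
import pandas as pd

# Sample data mirroring the first rows of the question's frame.
df = pd.DataFrame({
    "category": ["Cat1", "Cat2", "Cat3"],
    "date": ["2010-03-01", "2010-09-01", "2010-10-01"],
    "number": [1, 1, 1],
})

# Convert the string column to datetime64 so comparisons are done on dates.
df["date"] = pd.to_datetime(df["date"])


def date_exists(frame, date_like, column="date"):
    """Return True if date_like appears in frame[column] (hypothetical helper)."""
    target = pd.Timestamp(date_like)
    # Element-wise comparison gives a boolean Series; any() collapses it to one value.
    return bool((frame[column] == target).any())


found = date_exists(df, "2010-03-01")    # a date present in the sample
missing = date_exists(df, "2011-01-01")  # a date absent from the sample
print(found, missing)
```

Wrapping the target in pd.Timestamp means the comparison is between datetime values rather than between a datetime column and a raw string.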


Answer 2:

For example, to confirm that the 4th value of ds is contained within ds itself:

    len(set(ds.isin([ds.iloc[3]]))) > 1

Let ds be a pandas Series of Timestamp values (pandas._libs.tslib.Timestamp) with an integer index, with example values:

    0   2018-01-31 19:08:27.465515
    1   2018-02-01 19:08:27.465515
    2   2018-02-02 19:08:27.465515
    3   2018-02-03 19:08:27.465515
    4   2018-02-04 19:08:27.465515

Then we use the Series method isin to get a Series of booleans, where each entry indicates whether that position in ds matches the value passed as the argument (since isin expects a list of values, we need to wrap the value in a list).

Next, we use the built-in set function to get a set with 1 or 2 values, depending on whether there was a match (True and False values) or not (only a False value).

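Finally, we check whether the set contains more than 1 value. If it does, we have a match; otherwise there is no match. Note that this relies on at least one entry not matching: if every entry matches, the set holds only True and the test reports no match.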
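The steps above can be sketched as follows. The Series is built with pd.date_range as a stand-in for the example values (the microseconds are dropped for brevity), and the any() variant is included as a suggested alternative rather than part of the original answer.

```python
import pandas as pd

# Five consecutive daily timestamps, similar to the example values above.
ds = pd.Series(pd.date_range("2018-01-31 19:08:27", periods=5, freq="D"))

value = ds.iloc[3]
mask = ds.isin([value])          # boolean Series, True where ds equals value

set_check = len(set(mask)) > 1   # the set-based test from this answer
any_check = bool(mask.any())     # direct test: is at least one entry True?

# Edge case: a Series in which every entry matches the value.
single = pd.Series([value])
single_mask = single.isin([value])
set_check_single = len(set(single_mask)) > 1   # the set is {True}, so this is False
any_check_single = bool(single_mask.any())     # still reports the match

print(set_check, any_check, set_check_single, any_check_single)
```

Both checks agree on the five-element Series, but only any() reports the match when no entry differs from the value, so it is probably the safer choice in general.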


