Question:
I have a dataframe like this:
   category        date  number
0      Cat1  2010-03-01       1
1      Cat2  2010-09-01       1
2      Cat3  2010-10-01       1
3      Cat4  2010-12-01       1
4      Cat5  2012-04-01       1
5      Cat2  2013-02-01       1
6      Cat3  2013-07-01       1
7      Cat4  2013-11-01       2
8      Cat5  2014-11-01       5
9      Cat2  2015-01-01       1
10     Cat3  2015-03-01       1
I would like to check whether a date exists in this dataframe, but I am unable to. I tried several approaches, shown below, without success:
    if pandas.Timestamp("2010-03-01 00:00:00", tz=None) in df['date'].values:
        print('date exist')
    if datetime.strptime('2010-03-01', '%Y-%m-%d') in df['date'].values:
        print('date exist')
    if '2010-03-01' in df['date'].values:
        print('date exist')
'date exist' never gets printed. How can I check whether the date exists? I want to insert the missing dates, with number equal to 0, for all the categories so that I can plot a continuous line chart (one line per category). Any help is appreciated. Thanks in advance.
The last attempt gives me this warning:
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
and 'date exist' does not get printed.
Answer 1:
I think you need to convert the column to datetime first with to_datetime; then, if you need to select all matching rows, use boolean indexing:
    df.date = pd.to_datetime(df.date)

    print (df.date == pd.Timestamp("2010-03-01 00:00:00"))
    0      True
    1     False
    2     False
    3     False
    4     False
    5     False
    6     False
    7     False
    8     False
    9     False
    10    False
    Name: date, dtype: bool

    print (df[df.date == pd.Timestamp("2010-03-01 00:00:00")])
      category       date  number
    0     Cat1 2010-03-01       1
To get a single True, test membership against the column converted to a NumPy array with values:
    if ('2010-03-01' in df['date'].values):
        print ('date exist')
Or check for at least one True with any, as EdChum suggested in a comment:
    if (df.date == pd.Timestamp("2010-03-01 00:00:00")).any():
        print ('date exist')
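For reference, the conversion and the any check can be combined into one self-contained sketch. The sample frame only mirrors the first rows of the question, and the helper name date_exists is hypothetical, introduced here for illustration. The membership test at the end uses an explicit np.datetime64 value instead of a plain string, which avoids relying on how NumPy compares strings with datetime64 values.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the first rows of the question; dates start out as strings.
df = pd.DataFrame({
    "category": ["Cat1", "Cat2", "Cat3"],
    "date": ["2010-03-01", "2010-09-01", "2010-10-01"],
    "number": [1, 1, 1],
})

# Convert the column to datetime64 so comparisons are made on dates, not strings.
df["date"] = pd.to_datetime(df["date"])


def date_exists(frame, date_string):
    """Hypothetical helper: True if any row's date equals the given date."""
    return bool((frame["date"] == pd.Timestamp(date_string)).any())


found = date_exists(df, "2010-03-01")
missing = date_exists(df, "2011-01-01")

# Membership test on the underlying NumPy array with an explicit datetime64 value.
found_in_values = bool(np.datetime64("2010-03-01") in df["date"].values)

print(found, missing, found_in_values)
```

Wrapping the comparison in a small function keeps the Timestamp conversion in one place when several dates have to be checked.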
Answer 2:
For example, to confirm that the 4th value of ds is contained in ds itself:
len(set(ds.isin([ds.iloc[3]]))) > 1
Let ds be a pandas Series of the form [index, pandas._libs.tslib.Timestamp] with example values:
    0   2018-01-31 19:08:27.465515
    1   2018-02-01 19:08:27.465515
    2   2018-02-02 19:08:27.465515
    3   2018-02-03 19:08:27.465515
    4   2018-02-04 19:08:27.465515
Then, we use the Series method isin to get a boolean Series in which each entry indicates whether that position in ds matches the value passed as the argument (since isin expects a list of values, we need to pass the value inside a list).
Next, we use the built-in set function to get a set with 1 or 2 values, depending on whether there was a match (True and False values) or not (only a False value).
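Finally, we check whether the set contains more than 1 value. If it does, we have a match; otherwise there is no match. Note that this relies on ds also containing at least one non-matching value: if every entry matches, the set holds only True and the check reports no match.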
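The steps above can be sketched as a runnable example. It rebuilds ds from the sample values and compares the set-based check with calling any on the boolean mask, which is a swapped-in alternative rather than part of this answer. The variable names (target, mask, single) are chosen here for illustration.

```python
import pandas as pd

# Example Series of timestamps, matching the sample values in this answer.
ds = pd.Series(pd.to_datetime([
    "2018-01-31 19:08:27.465515",
    "2018-02-01 19:08:27.465515",
    "2018-02-02 19:08:27.465515",
    "2018-02-03 19:08:27.465515",
    "2018-02-04 19:08:27.465515",
]))

target = ds.iloc[3]          # the 4th value
mask = ds.isin([target])     # boolean Series, True where ds equals target

set_check = len(set(mask)) > 1   # the set-based check from this answer
any_check = bool(mask.any())     # alternative: is at least one entry True?

# Edge case: a Series in which every entry matches the target.
single = pd.Series(pd.to_datetime(["2018-02-03 19:08:27.465515"]))
single_mask = single.isin([single.iloc[0]])
single_set_check = len(set(single_mask)) > 1   # the set holds only True
single_any_check = bool(single_mask.any())

print(set_check, any_check, single_set_check, single_any_check)
```

On the five-value ds both checks agree. They differ only in the edge case where every entry matches, where any still reports the match.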