Question
I have the time series DataFrame below.
I want to delete rows based on a condition that is checked per day: if any aaa value on a given day is greater than 100, delete all rows for that day. (In the sample below, all 2015-12-01 rows should be deleted because the last 3 values in the aaa column are 1000.)
....
date time aaa
2015-12-01,00:00:00,0
2015-12-01,00:15:00,0
2015-12-01,00:30:00,0
2015-12-01,00:45:00,0
2015-12-01,01:00:00,0
2015-12-01,01:15:00,0
2015-12-01,01:30:00,0
2015-12-01,01:45:00,0
2015-12-01,02:00:00,0
2015-12-01,02:15:00,0
2015-12-01,02:30:00,0
2015-12-01,02:45:00,0
2015-12-01,03:00:00,0
2015-12-01,03:15:00,0
2015-12-01,03:30:00,0
2015-12-01,03:45:00,0
2015-12-01,04:00:00,0
2015-12-01,04:15:00,0
2015-12-01,04:30:00,0
2015-12-01,04:45:00,0
2015-12-01,05:00:00,0
2015-12-01,05:15:00,0
2015-12-01,05:30:00,0
2015-12-01,05:45:00,0
2015-12-01,06:00:00,0
2015-12-01,06:15:00,0
2015-12-01,06:30:00,1000
2015-12-01,06:45:00,1000
2015-12-01,07:00:00,1000
....
How can I do it?
Answer 1:
I think that, if you have a MultiIndex, you need to first compare the values of aaa against the condition, then select the matching values of the first index level by boolean indexing, and finally filter the rows with isin, inverting the condition with ~:
print (df)
                      aaa
date       time
2015-12-01 00:00:00     0
           00:15:00     0
           00:30:00     0
           00:45:00     0
2015-12-02 05:00:00     0
           05:15:00   200
           05:30:00     0
           05:45:00     0
2015-12-03 06:00:00     0
           06:15:00     0
           06:30:00  1000
           06:45:00  1000
           07:00:00  1000
lvl0 = df.index.get_level_values(0)
idx = lvl0[df['aaa'].gt(100)].unique()
print (idx)
Index(['2015-12-02', '2015-12-03'], dtype='object', name='date')
df = df[~lvl0.isin(idx)]
print (df)
                     aaa
date       time
2015-12-01 00:00:00    0
           00:15:00    0
           00:30:00    0
           00:45:00    0
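For reference, the MultiIndex steps above can be put together as a self-contained sketch. The sample frame here is a shortened, illustrative version of the one printed above, and the names bad_days and result are only for this example. The mask is converted with to_numpy() before indexing the level values, which is a small precaution rather than something the original snippet requires.

```python
import pandas as pd

# Shortened, illustrative sample with a (date, time) MultiIndex.
df = pd.DataFrame(
    {
        "date": ["2015-12-01"] * 2 + ["2015-12-02"] * 2 + ["2015-12-03"] * 2,
        "time": ["00:00:00", "00:15:00", "05:00:00", "05:15:00",
                 "06:30:00", "06:45:00"],
        "aaa": [0, 0, 0, 200, 1000, 1000],
    }
).set_index(["date", "time"])

# Date label of every row, taken from the first index level.
lvl0 = df.index.get_level_values("date")

# Days on which at least one aaa value exceeds 100.
bad_days = lvl0[df["aaa"].gt(100).to_numpy()].unique()

# Keep only the rows whose day is not among the flagged days.
result = df[~lvl0.isin(bad_days)]
print(result)
```

Only the 2015-12-01 rows remain in this sample, since the other two days each contain a value above 100.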
And if the first column is not part of the index, only compare against the date column:
print (df)
          date      time   aaa
0   2015-12-01  00:00:00     0
1   2015-12-01  00:15:00     0
2   2015-12-01  00:30:00     0
3   2015-12-01  00:45:00     0
4   2015-12-02  05:00:00     0
5   2015-12-02  05:15:00   200
6   2015-12-02  05:30:00     0
7   2015-12-02  05:45:00     0
8   2015-12-03  06:00:00     0
9   2015-12-03  06:15:00     0
10  2015-12-03  06:30:00  1000
11  2015-12-03  06:45:00  1000
12  2015-12-03  07:00:00  1000
idx = df.loc[df['aaa'].gt(100), 'date'].unique()
print (idx)
['2015-12-02' '2015-12-03']
df = df[~df['date'].isin(idx)]
print (df)
         date      time  aaa
0  2015-12-01  00:00:00    0
1  2015-12-01  00:15:00    0
2  2015-12-01  00:30:00    0
3  2015-12-01  00:45:00    0
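As a possible alternative for the plain-column case, the same filter can probably be written with groupby and transform. This is a different technique from the isin approach above, shown only as a sketch: it broadcasts each day's maximum aaa back onto the rows and keeps the days whose maximum does not exceed 100. The sample data and the names day_max and result are assumptions made for the example.

```python
import pandas as pd

# Shortened, illustrative sample with date as an ordinary column.
df = pd.DataFrame(
    {
        "date": ["2015-12-01"] * 2 + ["2015-12-02"] * 2 + ["2015-12-03"] * 2,
        "time": ["00:00:00", "00:15:00", "05:00:00", "05:15:00",
                 "06:30:00", "06:45:00"],
        "aaa": [0, 0, 0, 200, 1000, 1000],
    }
)

# Per-day maximum of aaa, repeated on every row of that day.
day_max = df.groupby("date")["aaa"].transform("max")

# A day is kept only if none of its values is greater than 100.
result = df[day_max.le(100)]
print(result)
```

Both versions drop the same rows; the transform version avoids building the intermediate list of dates, while the isin version makes the flagged days easy to inspect.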
Source: https://stackoverflow.com/questions/48556510/how-can-i-delete-whole-day-rows-on-condition-column-values-pandas