Slice Pandas DataFrame by Row

匆匆过客 提交于 2019-12-18 10:33:58

问题


I am working with survey data loaded from an h5-file as hdf = pandas.HDFStore('Survey.h5') through the pandas package. Within this DataFrame, all rows are the results of a single survey, whereas the columns are the answers for all questions within a single survey.

I am aiming to reduce this dataset to a smaller DataFrame including only the rows with a certain depicted answer on a certain question, i.e. with all the same value in this column. I am able to determine the index values of all rows with this condition, but I can't find how to delete this rows or make a new df with these rows only.


回答1:


In [36]: df
Out[36]:
   A  B  C  D
a  0  2  6  0
b  6  1  5  2
c  0  2  6  0
d  9  3  2  2

In [37]: rows
Out[37]: ['a', 'c']

In [38]: df.drop(rows)
Out[38]:
   A  B  C  D
b  6  1  5  2
d  9  3  2  2

In [39]: df[~((df.A == 0) & (df.B == 2) & (df.C == 6) & (df.D == 0))]
Out[39]:
   A  B  C  D
b  6  1  5  2
d  9  3  2  2

In [40]: df.ix[rows]
Out[40]:
   A  B  C  D
a  0  2  6  0
c  0  2  6  0

In [41]: df[((df.A == 0) & (df.B == 2) & (df.C == 6) & (df.D == 0))]
Out[41]:
   A  B  C  D
a  0  2  6  0
c  0  2  6  0



回答2:


If you already know the index you can use .loc:

In [12]: df = pd.DataFrame({"a": [1,2,3,4,5], "b": [4,5,6,7,8]})

In [13]: df
Out[13]:
   a  b
0  1  4
1  2  5
2  3  6
3  4  7
4  5  8

In [14]: df.loc[[0,2,4]]
Out[14]:
   a  b
0  1  4
2  3  6
4  5  8

In [15]: df.loc[1:3]
Out[15]:
   a  b
1  2  5
2  3  6
3  4  7


来源:https://stackoverflow.com/questions/11881165/slice-pandas-dataframe-by-row

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!