Filtering for rows in a Pandas dataframe containing at least one zero

Posted by ∥☆過路亽.° on 2020-01-06 07:11:52

Question


I am trying to delete all rows in a Pandas data frame that don't have a zero in either of two columns. My data frame is indexed from 0 to 620. This is my code:

for index in range(0, 621):
    if((zeroes[index,1] != 0) and (zeroes[index,3] != 0)):
        del(zeroes[index,])

I keep getting a key error. KeyError: (0, 1)

My instructor suggested I change the range to test to see if I have bad lines in my data frame. I did. I checked the tail of my dataframe and then changed the range to (616, 621). Then I got the key error: (616, 1).

Does anyone know what is wrong with my code or why I am getting a key error?

This code also produces a key error of (0,1):

index = 0
while (index < 621):
    if((zeroes[index,1] != 0) and (zeroes[index,3] != 0)):
        del(zeroes[index,])
index = index + 1

Answer 1:


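Don't use a manual for loop here. Your error occurs because df[x, y] calls df.__getitem__((x, y)), which pandas treats as a lookup for a single column labelled with the tuple (x, y). No such column exists, hence KeyError: (0, 1). Position-based access to a single cell goes through df.iloc[x, y] instead.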
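A minimal sketch of that difference, assuming a small made-up frame with default integer column labels (the name zeroes is borrowed from the question; the data is hypothetical):

```python
import pandas as pd

# Hypothetical data; only the shape matters for this illustration.
zeroes = pd.DataFrame([[5, 0, 7, 2], [1, 3, 4, 9], [8, 6, 2, 0]])

# zeroes[0, 1] asks for a column whose label is the tuple (0, 1).
# The columns are labelled 0..3, so the lookup fails with a KeyError.
try:
    zeroes[0, 1]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Positional access to row 0, column 1 goes through .iloc instead.
cell = zeroes.iloc[0, 1]
```

Here lookup_failed ends up True, while cell holds the value stored at row 0, column 1.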

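Instead, use vectorised operations and Boolean indexing. For example, to keep only the rows where column 1 or column 3 equals 0 (that is, to remove rows where neither is 0):

df = df[df.iloc[:, [1, 3]].eq(0).any(axis=1)]

This works because eq(0) creates a dataframe of Boolean values indicating equality to zero, and any(axis=1) reduces each row to True if it contains at least one True value. Indexing df with the resulting Boolean series keeps only those rows.

The axis can also be written as df.iloc[:, [1, 3]].eq(0).any(axis='columns') for even more clarity. The positional shorthand any(1) works only in older pandas versions; pandas 2.0 and later require axis to be passed as a keyword. See the docs for pd.DataFrame.any for more details.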
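As a rough end-to-end illustration of the Boolean-indexing approach, here is a sketch on a four-row frame. The data and the column names a to d are invented for the example; only the positions 1 and 3 come from the question.

```python
import pandas as pd

# Hypothetical sample data; columns 1 and 3 (by position) are the ones tested.
df = pd.DataFrame(
    [[5, 0, 7, 2],   # zero in column 1: kept
     [1, 3, 4, 9],   # no zero in columns 1 or 3: dropped
     [8, 6, 2, 0],   # zero in column 3: kept
     [0, 4, 0, 1]],  # zeros only in columns 0 and 2: dropped
    columns=["a", "b", "c", "d"],
)

# Boolean Series: True for rows where column 1 or column 3 equals zero.
mask = df.iloc[:, [1, 3]].eq(0).any(axis=1)

# Boolean indexing keeps only the rows flagged True.
filtered = df[mask]
kept_rows = filtered.index.tolist()
```

The filtered frame keeps the original index labels of the surviving rows; calling reset_index(drop=True) on it would renumber them from zero if that is preferred.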



Source: https://stackoverflow.com/questions/52770559/filtering-for-rows-in-a-pandas-dataframe-containing-at-least-one-zero
