Pandas DataFrame: how do I remove all columns and rows that sum to 0?

無奈伤痛 2020-12-17 18:03

I have a DataFrame with rows and columns that sum to 0.

    A   B   C    D
0   1   1   0    1
1   0   0   0    0
2   1   0   0    1
3   0   1   0    0
4   1   1   0    1


        
3 Answers
  • 2020-12-17 18:48

    df.loc[row_indexer, column_indexer] allows you to select rows and columns using boolean masks:

    In [88]: df.loc[(df.sum(axis=1) != 0), (df.sum(axis=0) != 0)]
    Out[88]: 
       A  B  D
    0  1  1  1
    2  1  0  1
    3  0  1  0
    4  1  1  1
    
    [4 rows x 3 columns]
    

    df.sum(axis=1) != 0 is a boolean Series that is True for each row whose values do not sum to 0.

    df.sum(axis=0) != 0 is the corresponding mask for columns: True for each column whose values do not sum to 0.
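
    For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same idea. The sample values are an assumption reconstructed from the output above, and the names row_mask, col_mask and result are only illustrative.

```python
import pandas as pd

# Sample data (assumed; chosen to match the output shown in this answer).
df = pd.DataFrame({'A': [1, 0, 1, 0, 1],
                   'B': [1, 0, 0, 1, 1],
                   'C': [0, 0, 0, 0, 0],
                   'D': [1, 0, 1, 0, 1]})

# Boolean Series indexed by row label: True where the row sum is nonzero.
row_mask = df.sum(axis=1) != 0
# Boolean Series indexed by column name: True where the column sum is nonzero.
col_mask = df.sum(axis=0) != 0

# .loc accepts one boolean mask per axis and keeps only the True positions.
result = df.loc[row_mask, col_mask]
print(result)
```

    Naming the two masks separately makes it easy to inspect which rows or columns are about to be dropped before applying the selection.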

  • 2020-12-17 18:55

    This is my way to do it:

    import pandas as pd

    df = pd.read_csv("my.csv")
    # Collect the names of the columns whose values do not sum to 0.
    keep = []
    for col in df.columns:
        if df[col].sum() != 0:
            keep.append(col)
    df2 = df[keep]
    

    To write the reduced data:

    df2.to_csv("my_reduced_data.csv")
    

    Note that this only checks columns; rows that sum to 0 are not removed.

  • 2020-12-17 19:03

    Building on "Drop rows with all zeros in pandas data frame", this avoids using sum():

    import pandas as pd

    df = pd.DataFrame({'A': [1,0,1,0,1],
                       'B': [1,0,0,1,1],
                       'C': [0,0,0,0,0],
                       'D': [1,0,1,0,1]})

    # Keep rows and columns that contain at least one nonzero value.
    df.loc[(df != 0).any(axis=1), (df != 0).any(axis=0)]
    
       A  B  D
    0  1  1  1
    2  1  0  1
    3  0  1  0
    4  1  1  1
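
    One caveat worth checking on your own data: testing for "any nonzero value" and testing for "sum is not 0" give the same result for the non-negative data above, but they can differ once negative values are present, because entries can cancel out. A small sketch with made-up values (the frame and the names by_sum and by_any are purely illustrative):

```python
import pandas as pd

# Hypothetical data with values that cancel out within rows and columns.
df = pd.DataFrame({'A': [1, -1, 0],
                   'B': [-1, 1, 0],
                   'C': [2, 0, 0]})

# Sum-based filter: drops rows/columns whose total is 0, even if they hold nonzero entries.
by_sum = df.loc[df.sum(axis=1) != 0, df.sum(axis=0) != 0]

# any-based filter: drops only rows/columns that consist entirely of zeros.
by_any = df.loc[(df != 0).any(axis=1), (df != 0).any(axis=0)]

print(by_sum)
print(by_any)
```

    Here the sum-based filter keeps only row 0 and column C, while the any-based filter keeps rows 0 and 1 and all three columns. Which one is right depends on whether "sums to 0" or "is all zeros" is what you actually mean.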
    