How to drop columns from a DataFrame according to their NaN percentage?

深忆病人 2020-11-28 10:49

For certain columns of df, 80% or more of the values are NaN.

What's the simplest code to drop such columns?

3 Answers
  •  没有蜡笔的小新
    2020-11-28 11:14

    As suggested in the comments, calling sum() on a boolean test gives you the number of occurrences, here the number of NaN values in a column.

    Code:

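    def get_nan_cols(df, nan_percent=0.8):
        # Minimum number of NaN values a column needs in order to be dropped
        threshold = len(df.index) * nan_percent
        # isnull() gives a boolean Series; .sum() counts its True values
        return [c for c in df.columns if df[c].isnull().sum() >= threshold]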
    

    Used as:

    # del df[...] only accepts a single column label, so use drop() for a list
    df = df.drop(columns=get_nan_cols(df, 0.8))
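
The same idea can also be written without an explicit loop. The following is only a minimal sketch, not part of the original answer: the function name `drop_mostly_nan_cols` and the sample data are made up for illustration. It relies on the fact that the mean of a boolean column is the fraction of True values, so `df.isna().mean()` gives the NaN fraction per column.

```python
import numpy as np
import pandas as pd


def drop_mostly_nan_cols(df, nan_percent=0.8):
    """Return a copy of df without columns whose NaN fraction is >= nan_percent.

    Hypothetical helper for illustration; the name is not from pandas.
    """
    # isna() yields a boolean frame; mean() over each column is the NaN fraction
    nan_fraction = df.isna().mean()
    # Keep only the columns below the cutoff
    return df.loc[:, nan_fraction < nan_percent].copy()


# Made-up sample data: "a" is 5/6 NaN, "b" is 1/6 NaN, "c" is entirely NaN
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, np.nan, np.nan, np.nan],
    "b": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0],
    "c": [np.nan] * 6,
})

cleaned = drop_mostly_nan_cols(df, 0.8)
print(list(cleaned.columns))
```

Like the function above, this treats a column sitting exactly at the cutoff as one to drop; change `<` to `<=` if such columns should be kept.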
    
