Finding count of distinct elements in DataFrame in each column

半阙折子戏 2020-12-02 18:27

I am trying to find the count of distinct values in each column using Pandas. This is what I did.

import pandas as pd
import numpy as np

# Generate data.
NR         


        
8 Answers
  • 2020-12-02 19:06

    To collect only the object-dtype (string/categorical) columns of a pandas DataFrame that have more than 20 unique values:

    # `data` is an existing DataFrame.
    col_with_morethan_20_unique_values_cat = []
    for col in data.columns:
        # Only look at object-dtype columns.
        if data[col].dtype == 'O':
            # unique() includes NaN, so a missing value counts as one distinct value.
            if len(data[col].unique()) > 20:
                col_with_morethan_20_unique_values_cat.append(data[col].name)

    print(col_with_morethan_20_unique_values_cat)
    print('total number of columns with more than 20 number of unique value is', len(col_with_morethan_20_unique_values_cat))
    
    
    
    # The output will look like:
    ['CONTRACT NO', 'X2', 'X3', ...]
    total number of columns with more than 20 number of unique value is 25
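
The same filter can probably be written without an explicit loop. The sketch below is only an illustration: the function name `high_cardinality_object_columns`, the `threshold` parameter, and the sample columns are hypothetical and do not come from the answer. It assumes the columns of interest really have object dtype, and it passes `dropna=False` so that NaN counts as a distinct value, as `len(col.unique())` does.

```python
import pandas as pd


def high_cardinality_object_columns(data: pd.DataFrame, threshold: int = 20) -> list:
    """Return names of object-dtype columns with more than `threshold` distinct values."""
    # Keep only object-dtype columns, mirroring the `dtype == 'O'` check.
    object_cols = data.select_dtypes(include=["object"])
    # dropna=False counts NaN as a distinct value, like len(col.unique()).
    counts = object_cols.nunique(dropna=False)
    return counts[counts > threshold].index.tolist()


# Hypothetical sample data, built with an explicit object dtype.
sample = pd.DataFrame({
    "contract_no": pd.Series([f"C{i}" for i in range(30)], dtype=object),
    "status": pd.Series(["open", "closed"] * 15, dtype=object),
    "amount": range(30),  # numeric, so it is skipped regardless of cardinality
})

selected = high_cardinality_object_columns(sample)
print(selected)
print(len(selected))
```

Here only `contract_no` should be selected: `status` has two distinct values and `amount` is not an object column.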
    
  • 2020-12-02 19:07

    Already some great answers here :) but this one seems to be missing:

    df.apply(lambda x: x.nunique())
    

    As of pandas 0.20.0, DataFrame.nunique() is also available.
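
A small sketch of the two calls side by side, using made-up sample data (the column names and values are assumptions, not taken from the question). It also shows the `dropna` parameter of `DataFrame.nunique()`, which controls whether NaN is counted as a value.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({
    "a": [1, 1, 2, 3],
    "b": ["x", "x", "x", "y"],
    "c": [1.0, np.nan, np.nan, 2.0],
})

# Per-column distinct counts; NaN is ignored by default.
counts = df.nunique()

# Same counts, but NaN is treated as its own distinct value.
counts_with_nan = df.nunique(dropna=False)

# The apply-based form from the answer should give the same result as df.nunique().
via_apply = df.apply(lambda col: col.nunique())

print(counts)
print(counts_with_nan)
```

With this data, column `c` has two distinct non-missing values, so it should count as 2 by default and 3 with `dropna=False`.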
