How to count the number of categorical features with Pandas?

Submitted by 房东的猫 on 2020-12-25 09:56:19

Question


I have a pd.DataFrame whose columns have different dtypes. I would like to count the number of columns of each dtype. I am using pandas 0.24.2.

I tried:

    dataframe.dtypes.value_counts()

This works for the other dtypes (float64, object, int64), but for some reason it does not aggregate the 'category' columns: I get a separate count for each categorical column, as if each one had a different dtype.

I also tried:

    dataframe.dtypes.groupby(by=dataframe.dtypes).agg(['count'])

But that raises:

    TypeError: data type not understood

Reproducible example:

    import pandas as pd

    df = pd.DataFrame(
        [['A', 'a', 1, 10], ['B', 'b', 2, 20], ['C', 'c', 3, 30]],
        columns=['col_1', 'col_2', 'col_3', 'col_4'],
    )

    # Convert the two string columns to the categorical dtype
    df['col_1'] = df['col_1'].astype('category')
    df['col_2'] = df['col_2'].astype('category')

    print(df.dtypes.value_counts())

Expected result:

    int64       2
    category    2
    dtype: int64

Actual result:

    int64       2
    category    1
    category    1
    dtype: int64

Answer 1:


As @jezrael mentioned, get_dtype_counts is deprecated since 0.25.0, and dtypes.value_counts() reports the two category columns as separate entries. To fix this, convert the dtypes to strings before counting:

    print(df.dtypes.astype(str).value_counts())

Output:

    int64       2
    category    2
    dtype: int64
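The likely reason the raw dtypes are not aggregated is that each categorical column carries its own CategoricalDtype, and that dtype includes the column's categories, so two columns with different categories have dtypes that do not compare equal. A minimal sketch of this, reusing the example frame from the question (the variable names same_dtype, dtype_names and counts are only illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    [["A", "a", 1, 10], ["B", "b", 2, 20], ["C", "c", 3, 30]],
    columns=["col_1", "col_2", "col_3", "col_4"],
)
df["col_1"] = df["col_1"].astype("category")
df["col_2"] = df["col_2"].astype("category")

# The two dtypes hold different categories (A, B, C versus a, b, c),
# so they are distinct objects that do not compare equal.
same_dtype = df["col_1"].dtype == df["col_2"].dtype
print(same_dtype)

# Every CategoricalDtype renders as the string "category", so converting
# the dtypes to strings first lets value_counts() group them together.
dtype_names = df.dtypes.astype(str)
counts = dtype_names.value_counts()
print(counts)
```

Counting the string names discards the per-column categories, which is fine when only the number of columns per dtype matters.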



Answer 2:


Use DataFrame.get_dtype_counts:

    print(df.get_dtype_counts())

    category    2
    int64       2
    dtype: int64

However, on recent versions of pandas your original approach is the recommended one, because the documentation for get_dtype_counts states:

    Deprecated since version 0.25.0.

    Use .dtypes.value_counts() instead.
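If only the number of categorical columns is needed, as the title asks, one option that does not depend on get_dtype_counts is to select the columns by dtype and count them. The sketch below is not taken from either answer; it assumes the example frame from the question, and the names categorical, n_categorical and n_by_isinstance are only illustrative:

```python
import pandas as pd

df = pd.DataFrame(
    [["A", "a", 1, 10], ["B", "b", 2, 20], ["C", "c", 3, 30]],
    columns=["col_1", "col_2", "col_3", "col_4"],
)
df["col_1"] = df["col_1"].astype("category")
df["col_2"] = df["col_2"].astype("category")

# Keep only the categorical columns, then count them.
categorical = df.select_dtypes(include="category")
n_categorical = categorical.shape[1]
categorical_columns = categorical.columns.tolist()
print(n_categorical, categorical_columns)

# Equivalent count by checking each column's dtype object directly.
n_by_isinstance = sum(isinstance(t, pd.CategoricalDtype) for t in df.dtypes)
print(n_by_isinstance)
```

The select_dtypes route also returns the column names, which is handy when the categorical features need to be encoded afterwards.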



Source: https://stackoverflow.com/questions/57213786/how-to-count-the-number-of-categorical-features-with-pandas
