pandas dataframe count unique list

Submitted by 眉间皱痕 on 2021-02-05 07:00:46

Question


If a DataFrame column holds ints, floats, or strings, we can get its unique values with df['columnName'].unique(). But what if each value in the column is a list, e.g. [1, 2, 3]? How can I get the unique values of that column?


Answer 1:


I think you can convert the values to tuples, after which unique() works nicely:

import pandas as pd

df = pd.DataFrame({'col':[[1,1,2],[2,1,3,3],[1,1,2],[1,1,2]]})
print(df)
            col
0     [1, 1, 2]
1  [2, 1, 3, 3]
2     [1, 1, 2]
3     [1, 1, 2]

# Tuples are hashable, so unique() can compare them
print(df['col'].apply(tuple).unique())

[(1, 1, 2) (2, 1, 3, 3)]

# Convert the unique tuples back to lists
L = [list(x) for x in df['col'].apply(tuple).unique()]
print(L)

[[1, 1, 2], [2, 1, 3, 3]]
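
Since the question title asks about counting, here is a minimal sketch of how the same tuple conversion could be used to count the distinct lists and how often each occurs. The names as_tuples, n_unique and counts are made up for illustration, and collections.Counter is just one way to tally the values, not something the answer above prescribes.

```python
from collections import Counter

import pandas as pd

df = pd.DataFrame({'col': [[1, 1, 2], [2, 1, 3, 3], [1, 1, 2], [1, 1, 2]]})

# Lists are unhashable, so convert each one to a tuple first.
as_tuples = df['col'].apply(tuple)

# Number of distinct lists in the column.
n_unique = len(as_tuples.unique())

# How many times each distinct list occurs (tuples are used as dict keys).
counts = Counter(as_tuples)

print(n_unique)
print(counts)
```

Here n_unique is 2, and counts maps (1, 1, 2) to 3 and (2, 1, 3, 3) to 1.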



Answer 2:


You cannot call unique() on values of a non-hashable type such as list. You need to convert them to a hashable type, such as tuple, first.
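
To illustrate the hashability point, here is a small sketch, assuming a Series s built only for this example. It catches the TypeError that unique() raises on a column of lists, then shows that the same call works once the values are tuples.

```python
import pandas as pd

# Example data made up for this sketch: a Series whose values are lists.
s = pd.Series([[1, 1, 2], [2, 1, 3, 3], [1, 1, 2]])

try:
    s.unique()          # lists cannot be hashed, so this fails
    raised = False
except TypeError as exc:
    raised = True
    print('unique() failed:', exc)

# After converting to tuples the values are hashable and unique() works.
unique_tuples = s.apply(tuple).unique()
print(unique_tuples)
```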

With a recent version of pandas, a better solution is to use duplicated(), which avoids iterating over the values to convert them back to lists.

df[~df.col.apply(tuple).duplicated()]

This returns the rows holding the unique values, with those values still stored as lists.
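
As a rough sketch of what the duplicated() approach produces on the example data from Answer 1 (the names mask, unique_rows and unique_lists are illustrative only):

```python
import pandas as pd

df = pd.DataFrame({'col': [[1, 1, 2], [2, 1, 3, 3], [1, 1, 2], [1, 1, 2]]})

# True for every row whose list has already appeared earlier in the column.
mask = df['col'].apply(tuple).duplicated()

# Keep only the first occurrence of each list; the cells remain lists
# and the original row labels (0 and 1 here) are preserved.
unique_rows = df[~mask]

# Plain Python list of the unique lists, if that form is needed.
unique_lists = unique_rows['col'].tolist()

print(unique_rows)
print(unique_lists)
```

Because the filtering is done with a boolean mask on the original frame, any other columns in df would be carried along with the surviving rows.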



Source: https://stackoverflow.com/questions/47901307/pandas-dataframe-count-unique-list
