pandas dataframe count unique list

Question

If a DataFrame column holds ints, floats or strings, we can get its unique values with columnName.unique(). But what if each value in the column is a list, e.g. [1, 2, 3]? How can I get the unique values of this column?

Answer 1:

I think you can convert the values to tuples, and then unique() works nicely:

import pandas as pd

df = pd.DataFrame({'col': [[1, 1, 2], [2, 1, 3, 3], [1, 1, 2], [1, 1, 2]]})
print(df)
            col
0     [1, 1, 2]
1  [2, 1, 3, 3]
2     [1, 1, 2]
3     [1, 1, 2]

print(df['col'].apply(tuple).unique())

[(1, 1, 2) (2, 1, 3, 3)]

# Convert the unique tuples back to lists
L = [list(x) for x in df['col'].apply(tuple).unique()]
print(L)

[[1, 1, 2], [2, 1, 3, 3]]
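
Since the title asks about counting, the same tuple conversion can probably be combined with nunique() and value_counts(). This is only a minimal sketch, assuming the example frame from this answer; the names as_tuples, n_unique and counts are illustrative and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col': [[1, 1, 2], [2, 1, 3, 3], [1, 1, 2], [1, 1, 2]]})

# Lists are unhashable, so convert each one to a tuple first.
as_tuples = df['col'].apply(tuple)

# Number of distinct lists in the column (illustrative variable name).
n_unique = as_tuples.nunique()

# How many times each distinct list occurs, most frequent first.
counts = as_tuples.value_counts()

print(n_unique)
print(counts)
```

For this frame there are two distinct lists: [1, 1, 2] appears three times and [2, 1, 3, 3] once.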

Answer 2:

You cannot apply unique() on a non-hashable type such as list. You need to convert to a hashable type to do that.
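
To see the hashability problem concretely, the sketch below calls unique() on a Series of lists and catches the resulting TypeError, then repeats the call after converting to tuples. The Series s and the variable names are made up for illustration.

```python
import pandas as pd

s = pd.Series([[1, 1, 2], [2, 1, 3, 3], [1, 1, 2]])

try:
    s.unique()
    error_message = None
except TypeError as exc:
    # Lists are mutable and define no hash, so they cannot be
    # placed in the hash table that unique() relies on.
    error_message = str(exc)

print(error_message)

# Tuples are hashable, so the same call succeeds after conversion.
unique_tuples = s.apply(tuple).unique()
print(unique_tuples)
```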

A better solution in recent versions of pandas is to use duplicated(), which avoids iterating over the values to convert them back to lists.

df[~df.col.apply(tuple).duplicated()]

This returns the rows holding the unique values, with each value still a list.
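
A self-contained version of this approach might look like the following. It is a sketch that reuses the example frame from Answer 1; is_repeat and unique_rows are illustrative names only.

```python
import pandas as pd

df = pd.DataFrame({'col': [[1, 1, 2], [2, 1, 3, 3], [1, 1, 2], [1, 1, 2]]})

# True for every row whose list already appeared earlier in the column.
is_repeat = df['col'].apply(tuple).duplicated()

# Keep only the first occurrences; the cells are still the original lists.
unique_rows = df[~is_repeat]

print(is_repeat.tolist())
print(unique_rows['col'].tolist())
```

The tuples are used only to build the boolean mask, so the result keeps the original index labels and list objects from df.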

Source: https://stackoverflow.com/questions/47901307/pandas-dataframe-count-unique-list

Tags

python

pandas

dataframe
