Question
If a column in a DataFrame has type int, float, or string, we can get its unique values with columnName.unique().
But what if each value in the column is a list, e.g. [1, 2, 3]?
How can I get the unique values of this column?
Answer 1:
I think you can convert the values to tuples, and then unique works nicely:
import pandas as pd

df = pd.DataFrame({'col': [[1, 1, 2], [2, 1, 3, 3], [1, 1, 2], [1, 1, 2]]})
print(df)
            col
0     [1, 1, 2]
1  [2, 1, 3, 3]
2     [1, 1, 2]
3     [1, 1, 2]

# Tuples are hashable, so unique() works on the converted column
print(df['col'].apply(tuple).unique())
[(1, 1, 2) (2, 1, 3, 3)]

# Convert the unique tuples back to lists
L = [list(x) for x in df['col'].apply(tuple).unique()]
print(L)
[[1, 1, 2], [2, 1, 3, 3]]
Answer 2:
You cannot apply unique() to a non-hashable type such as list. You need to convert the values to a hashable type first.
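To illustrate the hashability point, here is a minimal sketch. The sample Series and the helper is_hashable are made up for this example and are not part of the original question. Python's built-in hash() rejects a list but accepts the equivalent tuple, which is why converting each cell to a tuple makes unique() usable:

```python
import pandas as pd

# Illustrative data, not taken from the original question.
s = pd.Series([[1, 2], [1, 2], [3]])


def is_hashable(value):
    """Return True if the built-in hash() accepts the value (hypothetical helper)."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


list_ok = is_hashable([1, 2])    # lists are mutable, so hash() raises TypeError
tuple_ok = is_hashable((1, 2))   # a tuple of hashable items is hashable

# Once every cell is a tuple, unique() can deduplicate the column.
unique_tuples = s.apply(tuple).unique()
print(list_ok, tuple_ok, list(unique_tuples))
```

The conversion only works when the list elements are themselves hashable; nested lists would need a deeper conversion.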
With a recent version of pandas, a better solution is to use duplicated(), which avoids iterating over the values to convert them back to lists:
df[~df.col.apply(tuple).duplicated()]
This returns the rows that hold the first occurrence of each unique value, with the values still stored as lists.
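For reference, a self-contained sketch of the duplicated() approach, reusing the sample data from Answer 1. The variable names is_repeat, unique_rows, and unique_lists are chosen here for illustration:

```python
import pandas as pd

df = pd.DataFrame({'col': [[1, 1, 2], [2, 1, 3, 3], [1, 1, 2], [1, 1, 2]]})

# Convert each list to a tuple only for the comparison. duplicated() marks
# every repeat of an earlier value as True (keep='first' is the default).
is_repeat = df['col'].apply(tuple).duplicated()

# Keep the first occurrence of each list; the cells themselves remain lists.
unique_rows = df[~is_repeat]
unique_lists = unique_rows['col'].tolist()
print(unique_lists)
```

The result keeps the original index labels of the retained rows; calling reset_index(drop=True) on it renumbers them if that matters.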
Source: https://stackoverflow.com/questions/47901307/pandas-dataframe-count-unique-list