Get unique values of multiple columns as a new dataframe in pandas

Submitted by 不羁岁月 on 2020-02-13 07:50:47

Question


Given a pandas DataFrame df with at least the columns C1, C2, C3, how would you get all the unique (C1, C2, C3) combinations as a new DataFrame?

In other words, something similar to:

SELECT C1,C2,C3
FROM T
GROUP BY C1,C2,C3

I tried this:

print(df.groupby(by=['C1','C2','C3']))

but I'm getting

<pandas.core.groupby.DataFrameGroupBy object at 0x000000000769A9E8>

Answer 1:


I believe you need drop_duplicates if you want all unique triples:

df = df.drop_duplicates(subset=['C1','C2','C3'])
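Note that drop_duplicates with subset keeps every column of the first row in each group, not only C1, C2, C3. If the goal is a frame holding just the three key columns (closer to the SQL in the question), selecting them first is one option. The following is a minimal sketch; the sample data and the extra column named other are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; "other" stands in for any extra columns.
df = pd.DataFrame({
    "C1": [1, 1, 2, 2],
    "C2": ["a", "a", "b", "b"],
    "C3": [True, True, False, True],
    "other": [10, 20, 30, 40],
})

# Keeps all columns; only the first row per (C1, C2, C3) triple survives.
first_rows = df.drop_duplicates(subset=["C1", "C2", "C3"])

# Keeps only the three key columns, one row per distinct triple.
unique_triples = (
    df[["C1", "C2", "C3"]]
    .drop_duplicates()
    .reset_index(drop=True)  # renumber rows 0..n-1
)

print(first_rows)
print(unique_triples)
```

Both results have one row per distinct triple; they differ only in which columns are kept.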

If you want to use groupby, add first:

df = df.groupby(by=['C1','C2','C3'], as_index=False).first()
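The groupby variant behaves slightly differently from drop_duplicates: first() returns the first non-null value of each remaining column per group, and by default groupby sorts the groups by key and skips rows whose key columns contain NaN. A small sketch, again with made-up sample data and a hypothetical other column:

```python
import pandas as pd

# Hypothetical sample data with a missing value in a non-key column.
df = pd.DataFrame({
    "C1": [1, 1, 2],
    "C2": ["a", "a", "b"],
    "C3": ["x", "x", "y"],
    "other": [None, 20.0, 30.0],
})

# as_index=False keeps C1, C2, C3 as ordinary columns instead of an index.
firsts = df.groupby(by=["C1", "C2", "C3"], as_index=False).first()

# Restrict to the key columns if only the distinct triples are wanted.
keys_only = firsts[["C1", "C2", "C3"]]

print(firsts)
print(keys_only)
```

Here the group (1, 'a', 'x') gets other == 20.0, because first() skips the missing value, whereas drop_duplicates would keep the first row as-is, including its NaN.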


Source: https://stackoverflow.com/questions/48131812/get-unique-values-of-multiple-columns-as-a-new-dataframe-in-pandas
