drop_duplicates not working in pandas?

☆樱花仙子☆ submitted on 2019-11-29 14:30:37

You've got inplace=False so you're not modifying df. You want either

 df.drop_duplicates(subset=None, keep="first", inplace=True)

or

 df = df.drop_duplicates(subset=None, keep="first", inplace=False)
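A minimal sketch of the difference, using a small made-up DataFrame (the column names and values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical data: the first two rows are exact duplicates.
df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "x", "y"]})

# With inplace=False the deduplicated frame is returned and df is left untouched.
result = df.drop_duplicates(subset=None, keep="first", inplace=False)
rows_before = len(df)        # df itself still holds all the original rows
rows_returned = len(result)  # the returned frame has the duplicate removed

# Assigning the result back is what actually changes df.
df = df.drop_duplicates(subset=None, keep="first", inplace=False)
rows_after = len(df)
```

If the returned frame is never assigned to anything, the call has no visible effect, which is usually what "not working" means here.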

I have just had this issue, and this was not the solution.

It may be in the docs (I admittedly haven't looked), and crucially this only applies when rows are identified as unique by date: the 'date' column must actually have a datetime dtype.

If the date data is stored as a pandas object dtype, drop_duplicates may not work as expected; run pd.to_datetime on the column first.
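One way this can happen, sketched under assumptions: the data below is hypothetical, and it supposes the object column holds the same day written as two different strings. The conversion is applied per value here so that mixed formats parse; on a consistently formatted column, pd.to_datetime(df["date"]) should be enough.

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 are the same day written in two string formats.
df = pd.DataFrame({
    "date": ["2019-11-29", "2019/11/29", "2019-11-30"],
    "value": [10, 10, 20],
})

# As plain strings (object dtype) the first two dates compare as different,
# so drop_duplicates keeps every row.
as_strings = df.drop_duplicates()

# Convert each value to a Timestamp, then drop duplicates again.
df["date"] = df["date"].apply(pd.to_datetime)
deduped = df.drop_duplicates()
```

After the conversion the first two rows compare equal, so only one of them is kept.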

The use of inplace=False tells pandas to return a new dataframe with duplicates dropped, so you need to assign that back to df:

df = df.drop_duplicates(subset=None, keep="first", inplace=False)

or use inplace=True to tell pandas to drop duplicates in the current dataframe:

df.drop_duplicates(subset=None, keep="first", inplace=True)