drop duplicates pandas dataframe

Submitted by 我与影子孤独终老i on 2020-01-15 05:04:08

Question


I am getting an error message when using drop_duplicates to drop duplicate columns from my dataframe.

ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

Below is a minimal example. (Note that the column names here are not duplicated, since a column with a repeated name would not be added when building the frame this way; in my actual dataframe, var1 would be called var0.)

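import pandas as pd

dict1 = [{'var0': 0, 'var1': 0, 'var2': 2},
         {'var0': 0, 'var1': 0, 'var2': 4},
         {'var0': 0, 'var1': 0, 'var2': 8},
         {'var0': 0, 'var1': 0, 'var2': 12}]
df = pd.DataFrame(dict1, index=['s1', 's2', 's1', 's2'])
# Transpose, drop duplicate rows (the original columns), transpose back.
# This is the line that gives me the ValueError.
df.T.drop_duplicates().T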

Answer 1:


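The problem is with your index. It contains repeated labels ('s1' and 's2' each appear twice), so when you transpose the DataFrame those labels become duplicate column names, and that is what trips up drop_duplicates. Resetting the index before transposing avoids it. See below: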

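import pandas as pd

dict1 = [{'var0': 0, 'var1': 0, 'var2': 2},
         {'var0': 0, 'var1': 0, 'var2': 4},
         {'var0': 0, 'var1': 0, 'var2': 8},
         {'var0': 0, 'var1': 0, 'var2': 12}]
df = pd.DataFrame(dict1, index=['s1', 's2', 's1', 's2'])
# Move the repeated index labels into a regular column named 'index' so the
# transposed frame has unique column names, then restore them as the index.
df.reset_index().T.drop_duplicates().T.set_index('index')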
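
If you need this in more than one place, the same idea can be wrapped in a small helper. The following is only a sketch and not part of the original answer: the function name drop_duplicate_columns is made up for illustration. It first confirms that the transposed frame really has non-unique column names, then builds a boolean mask with DataFrame.duplicated on a transposed copy whose row labels were replaced by unique integers, and finally selects columns from the untouched original, so the index labels and dtypes are preserved.

```python
import pandas as pd

rows = [{'var0': 0, 'var1': 0, 'var2': 2},
        {'var0': 0, 'var1': 0, 'var2': 4},
        {'var0': 0, 'var1': 0, 'var2': 8},
        {'var0': 0, 'var1': 0, 'var2': 12}]
df = pd.DataFrame(rows, index=['s1', 's2', 's1', 's2'])

# The repeated index labels become repeated column names after transposing.
print(df.T.columns.is_unique)  # False


def drop_duplicate_columns(frame):
    """Return `frame` without columns whose values repeat an earlier column.

    Hypothetical helper for illustration; it is not a pandas API.
    """
    # Replace the row labels with unique integers so that the transposed
    # copy has unique column names.
    transposed = frame.reset_index(drop=True).T
    # True for every original column that is the first with its values.
    keep = ~transposed.duplicated()
    # Select by position-aligned boolean mask; the original index is untouched.
    return frame.loc[:, keep.to_numpy()]


result = drop_duplicate_columns(df)
print(list(result.columns))  # ['var0', 'var2']
print(list(result.index))    # ['s1', 's2', 's1', 's2']
```

Because the mask is applied by position rather than by label, this should also cope with the case from the question where two columns share the same name.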


Source: https://stackoverflow.com/questions/51470071/drop-duplicates-pandas-dataframe
