Removing duplicates case-insensitively and summing the adjacent column's values in a pandas DataFrame in Python

Submitted by 点点圈 on 2019-12-10 10:44:29

Question


I have a DataFrame df:

Name    Count
Ram     1
ram     2
raM     1
Arjun   3
arjun   4

My desired output is:

Name    Count
Ram     4
Arjun   7

I tried groupby but could not get the desired output. Please help.


Answer 1:


Group by the lowercased values of Name, then aggregate with first for Name and sum for Count:

df = (df.groupby(df['Name'].str.lower(), as_index=False, sort=False)
        .agg({'Name':'first', 'Count':'sum'}))
print (df)
    Name  Count
0    Ram      4
1  Arjun      7

Detail (the grouping key, computed on the original df before it is reassigned):

print (df['Name'].str.lower())
0      ram
1      ram
2      ram
3    arjun
4    arjun
Name: Name, dtype: object
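For readers who want to run this end to end, here is a minimal self-contained sketch of the same idea. It is not the answer's exact code: the DataFrame is rebuilt by hand from the question's table, the grouping key is renamed to `key` (an arbitrary name chosen here), and named aggregation (available from pandas 0.25) replaces the dict passed to `agg`.

```python
import pandas as pd

# DataFrame reconstructed from the table in the question.
df = pd.DataFrame({
    "Name": ["Ram", "ram", "raM", "Arjun", "arjun"],
    "Count": [1, 2, 1, 3, 4],
})

# Lowercased copy of Name, used only as the grouping key.
# The name "key" is an arbitrary choice for this sketch.
key = df["Name"].str.lower().rename("key")

result = (
    df.groupby(key, sort=False)        # sort=False keeps first-appearance order
      .agg(Name=("Name", "first"),     # keep the first spelling seen in each group
           Count=("Count", "sum"))     # add up the counts within each group
      .reset_index(drop=True)          # discard the lowercase key from the index
)
print(result)
```

This should print the same two rows as above, Ram with 4 and Arjun with 7. Grouping by a separate key rather than overwriting Name is what lets the original spelling of the first occurrence survive.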



Answer 2:


In [71]: df.assign(Name=df['Name'].str.capitalize()).groupby('Name', as_index=False).sum()
Out[71]:
    Name  Count
0  Arjun      7
1    Ram      4
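Note that the rows come back in alphabetical order (Arjun before Ram) because groupby sorts the group keys by default, whereas the desired output lists Ram first. A small variation, sketched here under the assumption that the input is the question's table, passes `sort=False` to keep the order of first appearance and selects `Count` explicitly so that no other column gets summed:

```python
import pandas as pd

# DataFrame reconstructed from the table in the question.
df = pd.DataFrame({
    "Name": ["Ram", "ram", "raM", "Arjun", "arjun"],
    "Count": [1, 2, 1, 3, 4],
})

# Overwrite Name with its capitalized form, so "ram" and "raM" both become "Ram".
normalized = df.assign(Name=df["Name"].str.capitalize())

# sort=False preserves first-appearance order instead of sorting the keys.
result = normalized.groupby("Name", as_index=False, sort=False)["Count"].sum()
print(result)
```

Unlike the first answer, this approach rewrites the displayed name: if the first occurrence were spelled "raM", the result would still show "Ram".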



Answer 3:


Grouping by the title-cased names reduces this to a single groupby and sum:

df.Count.groupby(df.Name.str.title()).sum().reset_index()
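The answer does not show its output, so the following is a runnable sketch of the same one-liner on the question's data, using bracket access instead of attribute access. It also compares `str.title()` with `str.capitalize()` on a multi-word name; the names in that second Series are made up purely for illustration.

```python
import pandas as pd

# DataFrame reconstructed from the table in the question.
df = pd.DataFrame({
    "Name": ["Ram", "ram", "raM", "Arjun", "arjun"],
    "Count": [1, 2, 1, 3, 4],
})

# Group the Count column by the title-cased names, then turn the
# resulting index back into a regular Name column.
result = df["Count"].groupby(df["Name"].str.title()).sum().reset_index()
print(result)

# Hypothetical multi-word names, only to show how the two methods differ.
names = pd.Series(["ram kumar", "RAM KUMAR"])
title_cased = names.str.title().tolist()        # every word starts uppercase
capitalized = names.str.capitalize().tolist()   # only the first character is uppercase
print(title_cased, capitalized)
```

For single-word names such as those in the question, the two methods give the same keys. They differ only in how the merged name is displayed when it contains more than one word.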


Source: https://stackoverflow.com/questions/47095122/removing-duplicates-with-ignoring-case-sensitive-and-adding-the-next-column-valu
