Summing rows based on keyword within index

[亡魂溺海] submitted on 2021-01-29 12:08:24

Question


I am trying to sum multiple rows together based on a keyword that is part of the index but is not the entire index. For example, the DataFrame could look like this:

                   Count
1234_Banana_Green   43
4321_Banana_Yellow  34
2244_Banana_Brown   23
12345_Apple_Red     45

I would like to sum all of the rows that have the same "keyword" within them and create a total "banana" row. Is there a way to do this without searching for the keyword "banana"? For my purposes, this keyword changes every time and I would like to be able to automate this summing process. Any help is very much appreciated.


Answer 1:


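Maybe this, which groups on the second underscore-separated field of the index:

# Split each index label on '_' and keep the second field (the keyword)
keyword = df.index.to_series().str.split('_', expand=True)[1]
# Group by that keyword and sum the counts
df.groupby(keyword)['Count'].sum()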

Output:

1
Apple      45
Banana    100
Name: Count, dtype: int64
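
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and uses `.str[1]` instead of `expand=True` to pick the keyword, which is a small variation on the answer rather than the answerer's exact code. The name `keyword` is only an illustrative label, and the sketch assumes every index label contains at least one underscore.

```python
import pandas as pd

# Sample data from the question; each index label is "<id>_<keyword>_<color>".
df = pd.DataFrame(
    {"Count": [43, 34, 23, 45]},
    index=["1234_Banana_Green", "4321_Banana_Yellow",
           "2244_Banana_Brown", "12345_Apple_Red"],
)

# Take the second underscore-separated field of each label as the keyword.
# "keyword" is just an illustrative name for the resulting group key.
keyword = df.index.to_series().str.split("_").str[1].rename("keyword")

# Group the rows by that keyword and sum the counts.
totals = df.groupby(keyword)["Count"].sum()
print(totals)
```

Because the group key is derived from the index itself, no keyword has to be named in advance, which is what the question asks for.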



Answer 2:


Given the following dataframe:

import pandas as pd

raw_data = {'id': ['1234_Banana_Green', '4321_Banana_Yellow',
                   '2244_Banana_Brown', '12345_Apple_Red',
                   '1267_Apple_Blue']}

df = pd.DataFrame(raw_data).set_index(['id'])

Try this code:

df = df.reset_index()
df['extracted_keyword'] = df['id'].apply(lambda x: x.split('_')[1])
df.groupby(["extracted_keyword"]).count()

This gives the number of rows per keyword (a count, not a sum, since this sample data has no Count column):

                   id
extracted_keyword    
Apple               2
Banana              3

If you want to restore the index, add this at the end:

df = df.set_index(['id'])
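
Since the question asks for sums rather than row counts, the same reset-and-extract approach can be adapted as in the sketch below. It assumes the frame has a `Count` column as in the question and uses only the four rows and values given there. The names `out` and `sums` are illustrative.

```python
import pandas as pd

# The four rows from the question, with 'id' as the index.
df = pd.DataFrame(
    {"id": ["1234_Banana_Green", "4321_Banana_Yellow",
            "2244_Banana_Brown", "12345_Apple_Red"],
     "Count": [43, 34, 23, 45]}
).set_index("id")

# Work on a copy with 'id' as a regular column so the original index is untouched.
out = df.reset_index()
out["extracted_keyword"] = out["id"].apply(lambda x: x.split("_")[1])

# Sum the Count column per keyword instead of counting rows.
sums = out.groupby("extracted_keyword")["Count"].sum()
print(sums)
```

Working on `out` instead of reassigning `df` means there is no need to restore the index afterwards.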


Source: https://stackoverflow.com/questions/58082549/summing-rows-based-on-keyword-within-index
