For example, I have a pandas dataframe as follows:
col_1  col_2  col_3  col_4
a      X      5      1
a      Y      3      2
a      Z      6      4
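The following code appends, for each col_1 group, a row labelled 'NEW' that holds the sums of col_3 and col_4 over the rows where col_2 is 'X' or 'Z':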
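import pandas as pd

def sum_group(df):
    # Keep only the rows of this group whose col_2 is 'X' or 'Z'
    dfxz = df[df.col_2.isin(['X', 'Z'])]
    # One-row frame: group key, the label 'NEW', and the two column sums
    sum_row = pd.DataFrame(
        [[
            df.col_1.iloc[0],
            'NEW',
            dfxz.col_3.sum(),
            dfxz.col_4.sum()
        ]], columns=dfxz.columns)
    # DataFrame.append was removed in pandas 2.0, so use pd.concat instead
    return pd.concat([df, sum_row], ignore_index=True)

df = pd.DataFrame([['a', 'X', 5, 1],
                   ['a', 'Y', 3, 2],
                   ['a', 'Z', 6, 4],
                   ['b', 'X', 7, 8],
                   ['b', 'Y', 4, 3],
                   ['b', 'Z', 6, 5]],
                  columns=['col_1', 'col_2', 'col_3', 'col_4'])

# Apply sum_group to each col_1 group, then flatten the resulting index
df = df.groupby('col_1').apply(
    sum_group,
).reset_index(drop=True)

print(df)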
The apply method of the groupby object calls sum_group once per group, and each call returns a dataframe. Those dataframes are then concatenated into a single dataframe. sum_group itself concatenates the incoming dataframe with an additional row, sum_row, that contains the reduced version of the dataframe according to the criteria you stated.
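Some recent pandas releases warn when the function passed to apply reads the grouping column, so here is a minimal sketch of the same per-group concatenation written as an explicit loop over the groups. The helper name with_summary_row and the variable result are my own, used only for illustration; the data is the six-row example from above.

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 'X', 5, 1], ['a', 'Y', 3, 2], ['a', 'Z', 6, 4],
     ['b', 'X', 7, 8], ['b', 'Y', 4, 3], ['b', 'Z', 6, 5]],
    columns=['col_1', 'col_2', 'col_3', 'col_4'])

def with_summary_row(group):
    # Hypothetical helper: same logic as sum_group, using pd.concat
    dfxz = group[group['col_2'].isin(['X', 'Z'])]
    sum_row = pd.DataFrame(
        [[group['col_1'].iloc[0], 'NEW',
          dfxz['col_3'].sum(), dfxz['col_4'].sum()]],
        columns=group.columns)
    return pd.concat([group, sum_row], ignore_index=True)

# Iterating over a groupby yields (key, sub-frame) pairs, and each
# sub-frame still contains col_1, so no apply call is needed.
result = pd.concat(
    [with_summary_row(g) for _, g in df.groupby('col_1')],
    ignore_index=True)

print(result)
```

For this data the 'NEW' rows come out as (11, 5) for group a and (13, 13) for group b, since only the X and Z rows contribute to the sums.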