Pandas groupby and aggregation output should include all the original columns (including the ones not aggregated on)
I have the following data frame and want to:

1. Group records by month
2. Sum QTY_SOLD and NET_AMT of each unique UPC_ID (per month)
3. Include the rest of the columns as well in the resulting dataframe

The way I thought I could do this is to first create a month column from D_DATE, then sum QTY_SOLD by UPC_ID.

Script:

    # Convert date to datetime object
    df['D_DATE'] = pd.to_datetime(df['D_DATE'])

    # Create aggregated months column
    df['month'] = df['D_DATE'].apply(dt.date.strftime, args=('%Y.%m',))

    # Group by month and sum up quantity sold by UPC_ID
    df = df.groupby(['month', 'UPC_ID'])['QTY_SOLD
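
For illustration, here is a minimal sketch of one way the three goals could be met in a single `groupby`. The sample rows and the `ITEM_DESC` column are hypothetical stand-ins for the real data and for "the rest of the columns". It assumes the non-aggregated columns hold the same value within each month/UPC_ID group, so taking the first value per group loses nothing; if that does not hold, a different reducer would be needed.

```python
import pandas as pd

# Hypothetical sample data; ITEM_DESC stands in for "the rest of the columns".
df = pd.DataFrame({
    "D_DATE": ["2019-01-03", "2019-01-17", "2019-01-20", "2019-02-05"],
    "UPC_ID": [111, 111, 222, 111],
    "ITEM_DESC": ["apple", "apple", "pear", "apple"],
    "QTY_SOLD": [2, 3, 1, 4],
    "NET_AMT": [4.0, 6.0, 1.5, 8.0],
})

# Convert to datetime and derive a year.month key via the .dt accessor.
df["D_DATE"] = pd.to_datetime(df["D_DATE"])
df["month"] = df["D_DATE"].dt.strftime("%Y.%m")

keys = ["month", "UPC_ID"]
sum_cols = ["QTY_SOLD", "NET_AMT"]

# Assumption: every remaining column is reduced with "first", which is only
# lossless when its value is constant within each month/UPC_ID group.
agg_map = {col: "first" for col in df.columns if col not in keys + sum_cols}
agg_map.update({col: "sum" for col in sum_cols})

# as_index=False keeps month and UPC_ID as ordinary columns in the result.
result = df.groupby(keys, as_index=False).agg(agg_map)
print(result)
```

Selecting only `['QTY_SOLD']` after the `groupby` restricts the output to that one column, which is why the other columns disappear; passing a per-column mapping to `agg` keeps them, at the cost of having to choose how each one is reduced.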