df.groupby(…).agg(set) produces different result compared to df.groupby(…).agg(lambda x: set(x))
Answering this question it turned out that df.groupby(...).agg(set) and df.groupby(...).agg(lambda x: set(x)) are producing different results. Data: df = pd.DataFrame({ 'user_id': [1, 2, 3, 4, 1, 2, 3], 'class_type': ['Krav Maga', 'Yoga', 'Ju-jitsu', 'Krav Maga', 'Ju-jitsu','Krav Maga', 'Karate'], 'instructor': ['Bob', 'Alice','Bob', 'Alice','Alice', 'Alice','Bob']}) Demo: In [36]: df.groupby('user_id').agg(lambda x: set(x)) Out[36]: class_type instructor user_id 1 {Krav Maga, Ju-jitsu} {Alice, Bob} 2 {Yoga, Krav Maga} {Alice} 3 {Ju-jitsu, Karate} {Bob} 4 {Krav Maga} {Alice} In [37]: df