pandas-groupby

How to DataFrame.groupby along axis=1

Posted by 好久不见. on 2021-01-27 08:19:41
Question: I have:

    df = pd.DataFrame({'A':[1, 2, -3],'B':[1,2,6]})
    df
       A  B
    0  1  1
    1  2  2
    2 -3  6

Q: How do I get:

         A
    0    1
    1    2
    2  1.5

using groupby() and aggregate()? Something like:

    df.groupby([0,1], axis=1).aggregate('mean')

So basically, group by along axis=1 and use row indexes 0 and 1 for grouping (without using transpose).

Answer 1: Are you looking for this?

    df.mean(1)
    Out[71]:
    0    1.0
    1    2.0
    2    1.5
    dtype: float64

If you do want groupby:

    df.groupby(['key']*df.shape[1],axis=1).mean()
    Out[72]:
       key
    0  1.0
    1  2.0
    2  1.5

Answer 2:
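
As a supplementary sketch (not taken from either answer): the row-wise mean can be computed directly with mean(axis=1). Because groupby(axis=1) is deprecated in recent pandas releases, the grouped variant below groups the transposed frame and transposes back, which does use the transpose the asker wanted to avoid. The 'key' label is only a placeholder that puts every column into one group.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, -3], 'B': [1, 2, 6]})

# Row-wise mean across all columns; no groupby required.
row_mean = df.mean(axis=1)

# One shared label per column, so all columns fall into a single group.
col_keys = ['key'] * df.shape[1]

# Group the transposed frame by those labels, then transpose back so the
# original rows are rows again.
grouped = df.T.groupby(col_keys).mean().T

print(row_mean)
print(grouped)
```

Both results hold the same three values; the grouped form is only worth the extra step when the columns split into more than one group.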

Pandas: Sort a dataframe based on multiple columns

Posted by て烟熏妆下的殇ゞ on 2021-01-27 05:51:57
Question: I know that this question has been asked several times, but none of the answers match my case. I have a pandas dataframe with the columns Department and Employee_Count. I need to sort the Employee_Count column in descending order, but if two employee counts are tied, those rows should be sorted alphabetically by department.

      Department  Employee_Count
    0        abc              10
    1        adc              10
    2        bca              11
    3        cde               9
    4        xyz              15

Required output:

      Department  Employee_Count
    0        xyz              15
    1        bca              11
    2        abc              10
    3        adc              10
    4        cde               9

This
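
One way to get the required ordering, sketched here with the sample data from the question, is to pass both columns to sort_values along with a matching list of sort directions. The reset_index call is an assumption about the desired output, made because the required table shows a fresh 0..4 index.

```python
import pandas as pd

df = pd.DataFrame({
    'Department': ['abc', 'adc', 'bca', 'cde', 'xyz'],
    'Employee_Count': [10, 10, 11, 9, 15],
})

# Sort by count descending, then break ties by department ascending.
result = (
    df.sort_values(['Employee_Count', 'Department'], ascending=[False, True])
      .reset_index(drop=True)   # renumber rows 0..n-1 after sorting
)

print(result)
```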

Add column for percentage of total to Pandas dataframe

Posted by 本小妞迷上赌 on 2021-01-21 12:33:51
Question: I have a dataframe that I am doing a groupby() on to get the counts of a column's values. I am trying to add an additional column for "Percentage of Total", and I'm not sure how to accomplish that. I've looked at a few groupby options but can't seem to find anything that fits. My dataframe looks like this:

              DAYSLATE
    DAYSLATE
    -7 days          1
    -5 days          2
    -3 days          8
    -2 days          9
    -1 days         45
    0 days         589
    1 days          33
    2 days           8
    3 days          16
    4 days          14
    5 days          16
    6 days           2
    7 days           6
    8 days           2
    9 days           2
    10 days          1

Answer 1: Option 1 df[
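
Independently of the truncated answer, a minimal sketch of the idea is to divide each count by the column total. The frame below is rebuilt from the counts shown in the question; the column name pct_of_total is a hypothetical choice, and the timedelta index is an assumption about how the "N days" labels are stored.

```python
import pandas as pd

days = [-7, -5, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
counts = [1, 2, 8, 9, 45, 589, 33, 8, 16, 14, 16, 2, 6, 2, 2, 1]

# Assumed layout: a timedelta index named DAYSLATE and a count column
# that carries the same name, as in the question's printout.
index = pd.to_timedelta(days, unit='D').rename('DAYSLATE')
df = pd.DataFrame({'DAYSLATE': counts}, index=index)

# Each count as a share of the column total, expressed in percent.
df['pct_of_total'] = df['DAYSLATE'] / df['DAYSLATE'].sum() * 100

print(df)
```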

Groupby + conditional from another column to create new one

Posted by 六眼飞鱼酱① on 2021-01-07 03:43:14
Question: I am trying to capture, for each user, the date of the row where visit_num == 2 in a new column ("2nd_visit_date"). Here is the code (including the new column I want to create):

    df=pd.DataFrame({'user':[1,1,2,2,2,3,3,3,3,3,4,4],
                     'date':['1995-09-01','1995-09-02','1995-10-03','1995-10-04','1995-10-05','1995-11-07','1995-11-08','1995-11-09','1995-11-10','1995-11-15','1995-12-18','1995-12-20'],
                     'visit_num':[1,2,1,2,3,1,2,3,4,5,1,2],
                     '2nd_visit_date':['1995-09-02','1995-09-02','1995-10-04','1995-10-04','1995-10-04
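
A possible approach, sketched with only the three input columns that appear in full above: select the rows where visit_num equals 2, build a user-to-date lookup from them, and map it back onto every row. This assumes each user has at most one such row; users without a second visit would get a missing value.

```python
import pandas as pd

df = pd.DataFrame({
    'user': [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4],
    'date': ['1995-09-01', '1995-09-02', '1995-10-03', '1995-10-04',
             '1995-10-05', '1995-11-07', '1995-11-08', '1995-11-09',
             '1995-11-10', '1995-11-15', '1995-12-18', '1995-12-20'],
    'visit_num': [1, 2, 1, 2, 3, 1, 2, 3, 4, 5, 1, 2],
})

# Lookup table: one date per user, taken from the visit_num == 2 rows.
second_visit = df.loc[df['visit_num'] == 2].set_index('user')['date']

# Broadcast each user's second-visit date to all of that user's rows.
df['2nd_visit_date'] = df['user'].map(second_visit)

print(df)
```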

How to fix this “TypeError: sequence item 0: expected str instance, float found”

Posted by 穿精又带淫゛_ on 2021-01-07 03:42:32
Question: I am trying to combine the cell values (strings) in a dataframe column using the groupby method, separating the values in each grouped cell with commas. I ran into the following error:

    TypeError: sequence item 0: expected str instance, float found

The error occurs on the following line of code (see the code block for the complete code):

    toronto_df['Neighbourhood'] = toronto_df.groupby(['Postcode','Borough'])['Neighbourhood'].agg(lambda x: ','.join(x))

It seems that in the groupby function, the
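
The message says str.join met a float, and a common cause is a missing value (NaN) in an otherwise string column. The sketch below assumes that is what happened here; the sample rows are hypothetical, since the asker's data is not shown. It drops missing values and casts the rest to str before joining.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; one Neighbourhood value is missing (NaN is a float).
toronto_df = pd.DataFrame({
    'Postcode': ['M1A', 'M1A', 'M1A', 'M2B'],
    'Borough': ['North', 'North', 'North', 'South'],
    'Neighbourhood': ['Alpha', np.nan, 'Gamma', 'Beta'],
})

# Drop missing values and force the remainder to str so join never
# receives a non-string item.
joined = (
    toronto_df.groupby(['Postcode', 'Borough'])['Neighbourhood']
              .agg(lambda x: ','.join(x.dropna().astype(str)))
              .reset_index()
)

print(joined)
```

The result has one row per (Postcode, Borough) pair, so it is kept as a separate frame rather than assigned back to a column of the longer original.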

How can I fill gaps by mean in period datetime column in pandas dataframe?

Posted by 夙愿已清 on 2021-01-05 07:07:41
Question: I have a dataframe like the one below:

    df = pd.DataFrame({'price': ['480,000,000','477,000,000', '608,700,000', '580,000,000', '350,000,000'],
                       'sale_date': ['1396/10/30','1396/10/30', '1396/11/01', '1396/11/03', '1396/11/07']})
    df
    Out[7]:
             price   sale_date
    0  480,000,000  1396/10/30
    1  477,000,000  1396/10/30
    2  608,700,000  1396/11/01
    3  580,000,000  1396/11/04
    4  350,000,000  1396/11/04

So then I define a period datetime and resample by day:

    df['sale_date']=df['sale_date'].str.replace('/','').astype(int)
    df[
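
The remainder of the question is cut off, so the following only sketches a likely first step: converting the comma-formatted price strings to integers and averaging them per sale_date. The column name price_num is a hypothetical choice. The dates are left as plain strings because they are not Gregorian dates that pandas can parse directly, and filling the missing days would need a calendar conversion that the visible text does not describe.

```python
import pandas as pd

df = pd.DataFrame({
    'price': ['480,000,000', '477,000,000', '608,700,000',
              '580,000,000', '350,000,000'],
    'sale_date': ['1396/10/30', '1396/10/30', '1396/11/01',
                  '1396/11/03', '1396/11/07'],
})

# Strip the thousands separators and convert to integers.
df['price_num'] = df['price'].str.replace(',', '', regex=False).astype('int64')

# Mean price per sale date; dates with several sales collapse to one row.
daily_mean = df.groupby('sale_date')['price_num'].mean()

print(daily_mean)
```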

How to show only column with Values in Pandas Groupby

Posted by 随声附和 on 2021-01-03 07:08:43
Question: Hello data scientists and pandas experts, I need some help, as I can't get my data organized properly. Here is my data frame:

    df_dict = [
        {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp1', 'duties': 'opening'},
        {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp2', 'duties': 'deli'},
        {'Date': Timestamp('2014-01-03 00:00:00'), 'Store': 'store1', 'employee': 'emp3', 'duties': 'cashier'},
        {'Date': Timestamp('2014-01-03 00:00:00'),