pandas-groupby | 易学教程

pandas groupby result using different combinations of boolean array as keys

Source: https://stackoverflow.com/questions/63497901/pandas-groupby-result-using-different-combinations-of-boolean-array-as-keys

Group by function with pandas dataset. Cronbach's alpha with grouped dataset in Python

Source: https://stackoverflow.com/questions/63529481/group-by-function-with-pandas-dataset-cronbachs-alpha-with-grouped-dataset-in

Generate random no of days based on random selection of columns

Question: I have a dataframe like the one shown below (thanks to the SO community for helping with it).

    df1 = pd.DataFrame({'person_id': [11, 11, 12, 13, 14],
                        'date_birth': ['01/01/1961', '12/30/1961', '05/29/1967', '01/01/1957', '7/27/1959']})
    df1 = df1.melt('person_id', value_name='dates')
    df1['dates'] = pd.to_datetime(df1['dates'])
    df_ranges = df1.assign(until_prev_year_days=(df1['dates'].dt.dayofyear - 1),
                           until_next_year_days=((df1['dates'] + pd.offsets.YearEnd(0)) - df1['dates']).dt.days)
    f = {'until
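The setup quoted in this excerpt can be run on its own. The sketch below reproduces only the part that survives before the text is cut off; the explicit `format` argument is an added assumption (month/day/year dates) and is not in the original question.

```python
import pandas as pd

df1 = pd.DataFrame({
    'person_id': [11, 11, 12, 13, 14],
    'date_birth': ['01/01/1961', '12/30/1961', '05/29/1967', '01/01/1957', '7/27/1959'],
})

# Reshape to long form: one row per (person_id, date) pair.
df1 = df1.melt('person_id', value_name='dates')

# Assumption: the strings are month/day/year.
df1['dates'] = pd.to_datetime(df1['dates'], format='%m/%d/%Y')

df_ranges = df1.assign(
    # Days elapsed since 1 January of the same year.
    until_prev_year_days=df1['dates'].dt.dayofyear - 1,
    # Days remaining until 31 December of the same year.
    until_next_year_days=((df1['dates'] + pd.offsets.YearEnd(0)) - df1['dates']).dt.days,
)
print(df_ranges)
```

`pd.offsets.YearEnd(0)` rolls each date forward to the end of its own year, so the subtraction yields the number of days left in that year.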

groupby cumulative in pandas then update using numpy based specific condition

Question: I have a data frame as shown below.

    B_ID  No_Show  Session  slot_num  Patient_count
    1     0.4      S1       1         1
    2     0.3      S1       2         1
    3     0.8      S1       3         1
    4     0.3      S1       3         2
    5     0.6      S1       4         1
    6     0.8      S1       5         1
    7     0.9      S1       5         2
    8     0.4      S1       5         3
    9     0.6      S1       5         4
    12    0.9      S2       1         1
    13    0.5      S2       1         2
    14    0.3      S2       2         1
    15    0.7      S2       3         1
    20    0.7      S2       4         1
    16    0.6      S2       5         1
    17    0.8      S2       5         2
    19    0.3      S2       5         3

From the above I would like to find the cumulative No_Show by Session:

    df['Cum_No_show'] = df.groupby(['Session'])['No_Show'].cumsum()

Now we get

    B_ID  No_Show  Session  slot_num  Patient_count
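As a minimal sketch of the cumulative step quoted in this excerpt, the data can be rebuilt from the listed rows and the running total computed per session. The conditional update mentioned in the title is not visible in the excerpt, so it is not attempted here.

```python
import pandas as pd

df = pd.DataFrame({
    'B_ID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15, 20, 16, 17, 19],
    'No_Show': [0.4, 0.3, 0.8, 0.3, 0.6, 0.8, 0.9, 0.4, 0.6,
                0.9, 0.5, 0.3, 0.7, 0.7, 0.6, 0.8, 0.3],
    'Session': ['S1'] * 9 + ['S2'] * 8,
    'slot_num': [1, 2, 3, 3, 4, 5, 5, 5, 5, 1, 1, 2, 3, 4, 5, 5, 5],
    'Patient_count': [1, 1, 1, 2, 1, 1, 2, 3, 4, 1, 2, 1, 1, 1, 1, 2, 3],
})

# Running total of No_Show that restarts at the first row of each Session.
df['Cum_No_show'] = df.groupby(['Session'])['No_Show'].cumsum()
print(df)
```

The result keeps the original row order and index, so it can be assigned straight back as a new column.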

Pandas: map column using a dictionary on multiple columns

Question: I have a dataframe with None values in one column. I would like to replace these None values with the maximum value of "category" for the same combination of the other columns. Example pandas dataframe:

    import pandas as pd
    d = {'company': ['Company1', 'Company1', 'Company1', 'Company1', 'Company2', 'Company2'],
         'product': ['Product A', 'Product A', 'Product F', 'Product A', 'Product F', 'Product F'],
         'category': ['1', None, '3', '2', None, '5']}
    df = pd.DataFrame(d)

      company product category
    0
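One possible way to do the fill described in this excerpt is a grouped `transform('max')` followed by `fillna`. This is a sketch, not the accepted answer: it assumes the category strings are meant to be compared as numbers, so they are converted first, and the result is a float column.

```python
import pandas as pd

d = {'company': ['Company1', 'Company1', 'Company1', 'Company1', 'Company2', 'Company2'],
     'product': ['Product A', 'Product A', 'Product F', 'Product A', 'Product F', 'Product F'],
     'category': ['1', None, '3', '2', None, '5']}
df = pd.DataFrame(d)

# Assumption: categories are numeric strings; None becomes NaN.
df['category'] = pd.to_numeric(df['category'])

# Maximum category per (company, product), broadcast back to every row.
group_max = df.groupby(['company', 'product'])['category'].transform('max')

# Only the missing entries are replaced; existing values are kept.
df['category'] = df['category'].fillna(group_max)
print(df)
```

`transform` returns a result aligned with the original index, which is what lets `fillna` match each missing value to its own group's maximum.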

Python - Grouping and Assigning Exception Rules

Question: I would like to group the list by first assigning Group 1 if the closest negative diff to 0 is Location 86, and Group 2 if the closest negative diff to 0 is Location 90. Group 3 would apply if Locations 86 and 90 are both the closest. After this pass, I would rerun the code, and wherever a group has not been assigned, assignment continues from Group 4 onward so as not to override the previous group assignments. The groupby is occurring based on

How do I use .loc with groupby so that creating a new column based on grouped data won't be considered a copy?

Question: I have a CSV file with groups of data, and am using the groupby() method to separate them. Each group is processed by a bit of simple math that uses min() and max() on a couple of columns, along with some subtraction and multiplication, to create a new column of data. I then graph each group. This mostly works, but I have two complaints about my code: the graphs are individual, not combined as I would prefer; and I get a "SettingWithCopyWarning" for each group. From my
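A common way to avoid the warning described in this excerpt is to compute the per-group minimum and maximum with `transform` and assign the new column on the original frame, instead of adding columns to each group slice. The sketch below uses made-up column names (`group`, `x`, `scaled`) and an arbitrary formula as a stand-in for the question's math, which is not shown.

```python
import pandas as pd

# Hypothetical data: names and values are illustrative only.
df = pd.DataFrame({
    'group': ['a', 'a', 'a', 'b', 'b'],
    'x': [1.0, 3.0, 5.0, 10.0, 20.0],
})

grouped = df.groupby('group')['x']
gmin = grouped.transform('min')  # per-group minimum, aligned to df's rows
gmax = grouped.transform('max')  # per-group maximum, aligned to df's rows

# The assignment targets the original frame, not a slice of it.
# The formula is a placeholder combining subtraction and multiplication.
df['scaled'] = (df['x'] - gmin) * (gmax - gmin)
print(df)
```

If a per-group loop is still needed, for plotting for example, calling `.copy()` on each group before adding columns to it is another common way to silence the warning.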

Combination of columns for aggregation after groupby

Question: Looking for something like

    df.groupby('key').aggregate(combination(columnA, columnB))

instead of

    df['combination'] = combination(columnA, columnB)
    df.groupby('key')['combination'].aggregate()

The only requirement is that the combination of columns is calculated after the groupby.

Description: It seems natural, logically, in some cases to group first and then aggregate. One example would be different aggregate functions for different combinations of columns that use the same
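One way to compute the combination after grouping is `GroupBy.apply`, which hands each group's sub-frame to a function. The sketch below is only an illustration: `combination` is a hypothetical stand-in (an element-wise product), `sum` is an arbitrary choice of aggregate, and the sample data is made up.

```python
import pandas as pd

# Hypothetical data using the column names from the question.
df = pd.DataFrame({
    'key': ['a', 'a', 'b'],
    'columnA': [1, 2, 3],
    'columnB': [10, 20, 30],
})

def combination(a, b):
    """Hypothetical combination of two columns: element-wise product."""
    return a * b

# The combination is evaluated inside each group, then aggregated.
result = df.groupby('key')[['columnA', 'columnB']].apply(
    lambda g: combination(g['columnA'], g['columnB']).sum()
)
print(result)
```

Selecting the two columns before `apply` keeps the grouping key out of the sub-frames passed to the function. For large frames, precomputing the column and using a built-in aggregate is usually faster than `apply`.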

Dynamically merge lines that share the same key into one

Question: I have a DataFrame and would like to make another column that combines the columns whose name begins with the same value in Answer and QID. That is to say, here is an excerpt of the dataframe:

       QID  Category    Text           QType   Question                      Answer0                      Answer1
    0  16   Automotive  Access to car  Single  Do you have access to a car?  I own a car/cars             I own a car/cars
    1  16   Automotive  Access to car  Single  Do you have access to a car?  I lease/ have a company car  I lease/have a company car
    2  16   Automotive  Access to car